PyQuant News 🐍

@pyquantnews

20 Tweets 11 reads Oct 02, 2022
College completely failed to teach me data analysis outside Excel.
So I spent over 5,000 hours learning Python.
Then, I picked the 16 best libraries for machine learning and data analysis.
But unlike college, these won't cost you $60,000.
Here they are for free:
AutoViz
AutoViz performs automatic visualization of any dataset with a single line of Python code. Give it an input file (CSV, TXT, or JSON) of any size and AutoViz will visualize it.
github.com
Numba
Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN.
numba.pydata.org
scikit-learn
scikit-learn is an open-source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities.
scikit-learn.org
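A short sketch of the preprocessing, fitting, and evaluation steps mentioned above, assuming scikit-learn is installed. The bundled iris dataset, logistic regression, and the 75/25 split are my picks for illustration, not recommendations from the library.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and the model live in one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```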
NetworkX
NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
networkx.org
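To make "creation and study" concrete, here's a tiny sketch assuming NetworkX is installed. The node labels and edges are invented for the example.

```python
import networkx as nx

# Build a small undirected graph from an edge list.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("C", "D"), ("A", "D"), ("D", "E")])

# Study its structure: shortest path and node degree.
path = nx.shortest_path(G, source="A", target="E")
degree_of_d = G.degree["D"]

print(path)
print(degree_of_d)
```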
pandas
pandas is a fast, powerful, flexible, and easy-to-use open source data analysis and manipulation tool, built on top of the Python programming language.
pandas.pydata.org
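A quick sketch of the kind of analysis people usually do in Excel, assuming pandas is installed. The tickers and prices are made-up sample data.

```python
import pandas as pd

# Hypothetical closing prices for two made-up tickers.
df = pd.DataFrame(
    {
        "ticker": ["AAA", "AAA", "BBB", "BBB"],
        "close": [10.0, 11.0, 20.0, 19.0],
    }
)

# Average close per ticker, similar to a pivot table in Excel.
mean_close = df.groupby("ticker")["close"].mean()
print(mean_close)
```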
Vaex
Vaex is a high-performance Python library for lazy Out-of-Core DataFrames (similar to Pandas), to visualize and explore big tabular datasets.
github.com
SciPy
SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python. With SciPy, an interactive Python session becomes a data-processing and system-prototyping environment.
scipy.org
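Two of those "convenience functions" in a small sketch, assuming SciPy is installed. The functions being integrated and minimized are arbitrary examples.

```python
from scipy import integrate, optimize

# Numerically integrate x^2 from 0 to 3 (the exact answer is 9).
area, abs_error = integrate.quad(lambda x: x**2, 0.0, 3.0)

# Find the minimum of (x - 2)^2 + 1, which sits at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

print(area)
print(result.x)
```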
XGBoost
XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework.
xgboost.readthedocs.io
PyMC
PyMC is a Python package for Bayesian statistical modeling focusing on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms. Its flexibility and extensibility make it applicable to a large suite of problems.
github.com
statsmodels
statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.
github.com
bokeh
Bokeh is an interactive visualization library for modern web browsers. It provides elegant, concise construction of versatile graphics, and affords high-performance interactivity over large or streaming datasets.
github.com
Blaze
Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze allows Python users a familiar interface to query data living in other data storage systems.
github.com
SparklingPandas
SparklingPandas aims to make it easy to use the distributed computing power of PySpark to scale your data analysis with Pandas. SparklingPandas builds on Spark's DataFrame class to give you a polished, pythonic, and Pandas-like API.
github.com
Superset
Superset is a modern data exploration and data visualization platform. Superset can replace or augment proprietary business intelligence tools for many teams. Superset integrates well with a variety of data sources.
github.com
PyCM
PyCM is a multi-class confusion matrix library written in Python. It accepts either input data vectors or a confusion matrix directly, and it works as a post-classification model evaluation tool that supports most per-class and overall statistics.
github.com
Plotly Dash
Built on top of Plotly.js, React, and Flask, Dash ties modern UI elements like dropdowns, sliders, and graphs directly to your analytical Python code.
github.com
Keep your $60,000.
Learn Python:
• Vaex
• SciPy
• Blaze
• PyMC
• bokeh
• PyCM
• Numba
• AutoViz
• pandas
• XGBoost
• Superset
• NetworkX
• Plotly Dash
• scikit-learn
• statsmodels
• SparklingPandas
There's a college degree packed into one thread!
Can't get to it all today?
Hop back to the top and retweet the top tweet so you can find it later - and so others can find it too!
If you like tweets about getting started with Python for quant finance, you might enjoy my weekly newsletter: The PyQuant Newsletter.
Real Python code for quant finance you can use now.
Join 5,400+ subscribers who are taking action with Python.
pyquantnews.com
