Toy store
There are so many interesting things to try. Here’s a short list; it will most probably grow with time. Some of the items may one day turn into an article. What would you like to read about in more detail?
Deep learning and neural networks
Scattering
Scattering is an approach to deep learning based on wavelets, developed in France by a team led by Stéphane Mallat. Here’s a rather good talk by Mallat describing challenges in deep learning and how scattering addresses them.
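To get a feel for what scattering computes, here's a heavily simplified first-order scattering transform in NumPy. The filter shapes and parameters below are made up for illustration; real implementations use proper Morlet filter banks over many scales.

```python
import numpy as np

def _conv(x, h):
    # Circular convolution via FFT (h is centered, hence the ifftshift)
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(np.fft.ifftshift(h)))

def scattering1(x, scales=(2, 4, 8), sigma_phi=16):
    # First-order scattering, heavily simplified:
    #   S0   = x * phi          (low-pass average)
    #   S1_j = |x * psi_j| * phi (averaged wavelet modulus)
    # phi is a Gaussian low-pass, psi_j are Morlet-like wavelets.
    n = len(x)
    t = np.arange(n) - n // 2
    phi = np.exp(-t**2 / (2.0 * sigma_phi**2))
    phi /= phi.sum()
    coeffs = [np.real(_conv(x, phi))]             # zeroth order
    for s in scales:
        psi = np.exp(-t**2 / (2.0 * s**2)) * np.exp(2j * np.pi * t / s)
        u = np.abs(_conv(x, psi))                 # modulus removes phase
        coeffs.append(np.real(_conv(u, phi)))     # averaging gives stability
    return np.stack(coeffs)
```

The modulus-then-average cascade is what makes the representation stable to small deformations, which is the core claim of Mallat's talk.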
Gaussian Processes
George: Gaussian processes in O(n log² n)
The usual time complexity of GP regression is O(n³). People have tried to overcome this by various means, such as approximations. This paper, Fast Direct Methods for Gaussian Processes, describes George - a Python implementation of Gaussian processes using a fast solver called HODLR.
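For reference, here's what the naive computation looks like - a minimal GP regression sketch in NumPy with an arbitrary RBF kernel and jitter. The Cholesky factorization of the n×n kernel matrix is the O(n³) step that HODLR-type solvers attack.

```python
import numpy as np

def rbf(a, b, length=0.2):
    # Squared-exponential kernel; the lengthscale here is arbitrary
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_predict(x, y, x_star, noise=1e-4):
    # The O(n^3) Cholesky factorization of the dense n x n kernel
    # matrix is exactly the bottleneck George's HODLR solver avoids.
    K = rbf(x, x) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return rbf(x_star, x) @ alpha
```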
Gaussian process kernels for pattern discovery and extrapolation
The guys at Cambridge are up to some nifty stuff. One example is Andrew Wilson and his extrapolation kernels. Here's the project page with papers and code.
Optimization
Fast large-scale optimization by unifying stochastic gradient and quasi-Newton methods
There are few hyperparameters to tune, unlike with SGD. Here are the paper and the code.
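For contrast, here's a sketch of plain SGD with momentum on least squares, showing the knobs - learning rate, decay, momentum - that usually need hand-tuning. The values below are arbitrary.

```python
import numpy as np

def sgd(X, y, lr=0.05, decay=0.01, momentum=0.5, epochs=100, seed=0):
    # Plain SGD on least squares. lr, decay and momentum all interact
    # and typically need tuning - the kind of burden the unified
    # SGD/quasi-Newton method aims to remove.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    v = np.zeros_like(w)
    for epoch in range(epochs):
        step = lr / (1.0 + decay * epoch)    # learning rate decay schedule
        for i in rng.permutation(len(y)):
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of squared error
            v = momentum * v - step * grad
            w = w + v
    return w
```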
Stochastic Gradient Hamiltonian Monte Carlo
What about training a neural network using gradient descent with Monte Carlo added for good measure? The paper and code.
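A minimal sketch of the SGHMC update (Chen et al., 2014) on a toy 1D target - a standard normal posterior with artificially noisy gradients. Step size and friction are hand-picked for illustration.

```python
import numpy as np

def sghmc(grad_u, theta0, eps=0.01, alpha=0.1, steps=5000, seed=0):
    # SGHMC: momentum updates from noisy gradients, with friction
    # `alpha` and injected noise N(0, 2*alpha*eps) to keep the
    # stationary distribution close to the target posterior.
    rng = np.random.default_rng(seed)
    theta, v = theta0, 0.0
    samples = []
    for _ in range(steps):
        v = (v - eps * grad_u(theta, rng) - alpha * v
             + rng.normal(0.0, np.sqrt(2 * alpha * eps)))
        theta = theta + v
        samples.append(theta)
    return np.array(samples)

# Toy target: U(theta) = theta^2 / 2, gradient observed with noise
noisy_grad = lambda theta, rng: theta + rng.normal(0.0, 0.1)
draws = sghmc(noisy_grad, theta0=0.0)
```

After burn-in the draws should look roughly like samples from N(0, 1).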
Software packages
Factorization machines
Matrix factorization for recommender systems usually involves users and items - a matrix of ratings. Some more ambitious attempts add time as a third dimension, resulting in a 3D tensor.
Factorization machines generalize matrix factorization to any number of dimensions. The author, Steffen Rendle, has released his libFM. Apparently there’s also FM implementation in BidMach (see below).
When asked about influential papers in machine learning, Peter Norvig mentioned Rendle’s Factorization Machines as the first.
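The model itself is compact enough to sketch: y(x) = w0 + sum_i wi*xi + sum_{i<j} <vi, vj>*xi*xj, where Rendle's identity lets you compute the pairwise term in O(kn) instead of O(kn²). A NumPy sketch of the prediction step (training not shown):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    # Second-order factorization machine (Rendle 2010):
    #   y = w0 + <w, x> + sum_{i<j} <V[i], V[j]> * x[i] * x[j]
    # The pairwise term uses the O(k n) identity:
    #   0.5 * sum_f [ (sum_i V[i,f] x[i])^2 - sum_i V[i,f]^2 x[i]^2 ]
    linear = w0 + w @ x
    s = V.T @ x                    # shape (k,)
    s2 = (V ** 2).T @ (x ** 2)
    return linear + 0.5 * np.sum(s ** 2 - s2)
```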
LIBSVM cousins
LIBSVM might be a bit boring, but it’s solid software, and its authors have several other lesser-known and possibly more interesting packages, all apparently very fast. Among them:
- LIBMF - a matrix factorization library for recommender systems
- DC-SVM - Divide-and-Conquer kernel (non-linear) SVM, again very fast and accurate if you believe the paper
TMVA
TMVA is a machine learning toolkit from actual “rocket scientists” at CERN. It features a number of algorithms, including the rarely seen RuleFit.
Shogun
Shogun is a mature machine learning toolbox. It’s written in C/C++ for efficiency, but has bindings for many languages including Python and Matlab.
BidMach
BidMach is apparently a very fast - scratch that, insanely fast - learning system in Scala from Berkeley. It features linear models, factor models and k-means.
Software using GPU
Quite a few packages listed in Running things on a GPU.
Software for optimizing hyperparams
Quite a few packages listed in Optimizing hyperparams with hyperopt.
Research code
Pedro Domingos’ software
Pedro Domingos’ research seems to focus on two areas:
- statistical relational learning, embodied in Alchemy. Alchemy looks like an attempt to bring together logic/AI and machine learning using Markov logic networks.
- sum-product networks, a different form of deep learning with fast exact probabilistic inference over many layers.
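To illustrate the “fast exact inference” part: in a sum-product network, evaluating any probability is a single bottom-up pass through sum and product nodes. A toy hand-built network over two binary variables (the structure and weights are made up):

```python
import numpy as np

# Leaves are Bernoulli distributions, products combine disjoint
# variable scopes, sums are weighted mixtures.

def leaf(p):                      # Bernoulli leaf: P(X=1) = p
    return lambda x: p if x == 1 else 1 - p

def product(children):
    return lambda xs: np.prod([c(x) for c, x in zip(children, xs)])

def spn(x1, x2):
    # Mixture of two product nodes over (x1, x2)
    comp1 = product([leaf(0.9), leaf(0.2)])
    comp2 = product([leaf(0.1), leaf(0.7)])
    return 0.6 * comp1([x1, x2]) + 0.4 * comp2([x1, x2])

# Exact inference in one pass per query; a valid SPN normalizes:
total = sum(spn(a, b) for a in (0, 1) for b in (0, 1))
```

The same bottom-up pass, with leaves marginalized out, answers arbitrary marginal queries - that's what makes inference tractable across many layers.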
GloVe: global vectors for word representation
Representing words as vectors is a hot topic. GloVe is apparently the latest and greatest approach. The author, Jeffrey Pennington, has the paper, code and some vectors for you.
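Once you have the vectors, similarity queries are just cosine similarity. A toy sketch with made-up three-dimensional vectors (real GloVe vectors are downloadable from the project page and have 50+ dimensions):

```python
import numpy as np

# Tiny made-up "word vectors", purely for illustration
vectors = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "queen": np.array([0.7, 0.7, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 for parallel vectors, 0.0 for orthogonal
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```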
Trees
Regularized Greedy Forest
RGF is software by Rie Johnson and Tong Zhang, winners of the Heritage Health Prize and the Bond Trade Price Challenge at Kaggle. They also have stochastic variance reduced gradient (SVRG) code.
eXtreme Gradient Boosting
XGBoost is an efficient general purpose gradient boosting library. It enjoys some popularity among Kaggle competitors.
Hybrid extreme rotation forest
HERF combines random forests with extreme learning machines. The abstract says:
In the HERF algorithm, training of each individual classifier involves two steps: first computing a randomized data rotation transformation of the training data, second, training the individual classifier on the rotated data.
The paper is paywalled. There’s some scikit-learn style Python code by the author.
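Going only by that abstract, one ensemble member might look roughly like this: a random orthogonal rotation of the data, then an extreme learning machine (random hidden layer, least-squares readout) trained on the rotated data. All parameters here are guesses.

```python
import numpy as np

def random_rotation(d, rng):
    # Random orthogonal matrix via QR of a Gaussian matrix
    q, r = np.linalg.qr(rng.normal(size=(d, d)))
    return q * np.sign(np.diag(r))

def train_herf_member(X, y, hidden=50, seed=0):
    # One ensemble member, per the abstract's two steps:
    # (1) rotate the training data, (2) fit an ELM on the rotated data
    rng = np.random.default_rng(seed)
    R = random_rotation(X.shape[1], rng)
    W = rng.normal(size=(X.shape[1], hidden))   # random hidden layer
    H = np.tanh((X @ R) @ W)
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # least-squares readout
    return lambda Xn: np.tanh((Xn @ R) @ W) @ beta
```

A full HERF would average many such members, each with its own rotation, which is where the forest-style diversity comes from.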
Rotation forest
An implementation of the “Rotation Forest” algorithm from Rodriguez et al. 2006 in R. Every tree has the features randomly split into k subsets and Principal Component Analysis is applied separately to each subset.
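The per-tree rotation can be sketched in NumPy: split the features into k random subsets, run PCA (here via SVD) on each, and assemble the component vectors into a block rotation matrix. This is a sketch of the transform only, not the full forest.

```python
import numpy as np

def rotation_matrix(X, k=2, rng=None):
    # Rotation Forest's per-tree transform (Rodriguez et al. 2006):
    # random feature subsets, PCA per subset, block-assembled rotation.
    if rng is None:
        rng = np.random.default_rng(0)
    d = X.shape[1]
    perm = rng.permutation(d)
    R = np.zeros((d, d))
    for subset in np.array_split(perm, k):
        Xs = X[:, subset] - X[:, subset].mean(axis=0)
        _, _, components = np.linalg.svd(Xs, full_matrices=False)
        R[np.ix_(subset, subset)] = components.T
    return R            # each tree is then trained on X @ R
```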
CloudForest
Fast, flexible, multi-threaded ensembles of decision trees for machine learning in pure Go (golang). GitHub