Python Tools for Coding and Feature Learning; SciPy 2013 Presentation
Authors: Johnson, Leif, University of Texas at Austin
Track: Machine Learning
Sparse coding and feature learning have become popular areas of research in machine learning and neuroscience in the past few years, and for good reason: sparse codes can be applied to real-world data to obtain "explanations" that make sense to people, and the features used in these codes can be learned automatically from unlabeled datasets. In addition, sparse coding is a good model for the sorts of data processing that happen in some areas of the brain that handle sensory data (Olshausen & Field 1996, Smith & Lewicki 2006), hinting that sparsity or redundancy reduction (Barlow 1961) is a good way of representing raw, real-world signals.
In this talk I will summarize several algorithms for sparse coding (k-means [MacQueen 1967], matching pursuit [Mallat & Zhang 1994], lasso regression [Tibshirani 1996], sparse neural networks [Lee, Ekanadham & Ng 2008, Vincent & Bengio 2010]) and describe associated algorithms for learning dictionaries of features to use in the encoding process. The talk will include pointers to several nice Python tools for performing these tasks, including standard scipy function minimization, scikit-learn, SPAMS, MORB, and my own packages for building neural networks. Many of these techniques converge to the same or qualitatively similar solutions, so I will briefly mention some recent results indicating that the choice of encoding can be more important than the specific features that are used (Coates & Ng 2011).
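As a rough illustration of one of the encoders listed above, the following is a minimal NumPy sketch of matching pursuit. It is not taken from the talk or from any of the packages named: the function name `matching_pursuit`, the `n_nonzero` parameter, and the random dictionary are all assumptions made for the example, and the dictionary rows (atoms) are assumed to have unit norm.

```python
import numpy as np


def matching_pursuit(signal, dictionary, n_nonzero=3):
    """Greedily encode `signal` using the unit-norm rows of `dictionary`.

    Hypothetical helper for illustration only; returns (code, residual).
    """
    residual = np.asarray(signal, dtype=float).copy()
    code = np.zeros(dictionary.shape[0])
    for _ in range(n_nonzero):
        # Correlate the current residual with every atom.
        correlations = dictionary @ residual
        # Pick the atom that explains the most remaining energy.
        best = int(np.argmax(np.abs(correlations)))
        # Accumulate its coefficient and remove its contribution.
        code[best] += correlations[best]
        residual -= correlations[best] * dictionary[best]
    return code, residual


# Toy data: a random dictionary of 20 unit-norm atoms in 8 dimensions,
# and a signal built from two of those atoms.
rng = np.random.default_rng(0)
atoms = rng.standard_normal((20, 8))
atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)
x = 2.0 * atoms[3] - 1.5 * atoms[11]

code, residual = matching_pursuit(x, atoms, n_nonzero=5)
```

Here `code @ atoms + residual` reconstructs `x`, and at most `n_nonzero` entries of `code` are nonzero. In this sketch the dictionary is fixed and random; the dictionary-learning algorithms mentioned above would instead adapt the atoms to the data.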