Sparse SVDs in Python
After Fabian's post on the topic, I have recently returned to thinking about the subject of sparse singular value decompositions (SVDs) in Python.
For those who haven't used it, the SVD is an extremely powerful technique. It is the core routine of many applications, from filtering to dimensionality reduction to graph analysis to supervised classification and much, much more.
I first came across the need for a fast sparse SVD when applying a technique called Locally Linear Embedding (LLE) to astronomy spectra: it was the first astronomy paper I published, and you can read it here. In LLE, one visualizes the nonlinear relationship between high-dimensional observations. The computational cost is extreme: for N objects, one must compute the null space (intimately related to the SVD) of a N by N matrix. Using direct methods (e.g. LAPACK), this can scale as bad as $\mathcal{O}[N^3]$ in both memory and speed!