September 17, 2019, 2:00 PM - 2:40 PM
Busch Campus Student Center
604 Bartholomew Rd
Ravi Kannan, Microsoft Research
We formulate a geometric problem called Latent Simplex Learning (LSL), which includes as special cases the core problem in several Unsupervised Learning settings, such as Topic Modeling and Mixed Membership Community Block models. LSL is simply stated: find a latent k-simplex K in R^d given n data points, each obtained by perturbing a latent point in K. The perturbations are large, and most data points lie (far) outside K.
Under natural assumptions, which we show hold in the applications, we devise an algorithm for LSL that runs in time O*(k·nnz), matching the best running times of algorithms for the special cases. We use a new technique we call "subset smoothing," which optimizes k carefully chosen linear functions over the convex hull of the (n choose δn) points obtained by averaging all δn-subsets of the data points. We prove that each optimization yields an approximation to a new vertex of K. The proof is non-trivial and uses existing and new tools from Numerical Analysis, especially bounds on angles between singular spaces of close-by matrices.
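A minimal sketch of the key computational observation behind subset smoothing: although the smoothed polytope is the convex hull of exponentially many subset averages, a linear function u·x is maximized over it by averaging the δn data points with the largest inner product with u, so each optimization is a single sort. The function name and the choice of direction u here are illustrative; the paper's algorithm selects the k directions carefully.

```python
import numpy as np

def smoothed_vertex(A, u, delta):
    """Maximize the linear function x -> u.x over the subset-smoothed
    polytope: the convex hull of the averages of all floor(delta*n)-element
    subsets of the n data points (columns of A).

    The maximum is attained at the average of the m points with the largest
    inner product with u, so no subsets need to be enumerated explicitly.
    """
    n = A.shape[1]
    m = max(1, int(delta * n))
    scores = u @ A                   # inner product of u with each data point
    top = np.argsort(scores)[-m:]    # indices of the m highest-scoring points
    return A[:, top].mean(axis=1)    # their average optimizes u.x over the hull

# Illustrative usage with synthetic 2-D data (not from the paper):
A = np.array([[0.0, 1.0, 2.0, 3.0],
              [0.0, 0.0, 0.0, 0.0]])
v = smoothed_vertex(A, np.array([1.0, 0.0]), 0.5)  # averages the 2 rightmost points
```

Averaging a δn-fraction of the points, rather than taking a single extreme point, is what damps the large per-point perturbations: the output concentrates near a vertex of K even though individual data points lie far outside it.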
Joint work with C. Bhattacharyya.