Matrix Sketching over Streams

February 07, 2018, 11:00 AM - 12:00 PM

Location:

Conference Room 301

Rutgers University

CoRE Building

96 Frelinghuysen Road

Piscataway, NJ 08854

Mina Ghashami, Rutgers University

It is common to represent data in the form of a matrix, and many data analytic tasks rely on obtaining a low-rank approximation of the data matrix. Such approximations can be computed using the singular value decomposition (SVD). In many scenarios, however, data matrices are extremely large and computing their SVD exactly is infeasible. Efficient approximate solutions exist for distributed settings and for settings where data access is otherwise limited. In the data streaming model, the data points are presented to the algorithm one by one in an arbitrary order. The algorithm is tasked with processing the stream in one pass while being severely restricted in its memory footprint. At the end of the stream, the algorithm must provide a sketch matrix that is a good approximation of the original data.

In this talk, we will discuss two recent matrix sketching methods over data streams.
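
For readers new to the streaming model described in the abstract, here is a minimal Python sketch of a one-pass, bounded-memory matrix sketch. The abstract does not name the two methods covered in the talk, so this example is not one of them by assumption: it uses a random-sign projection, a simple and well-known baseline, chosen only because it needs nothing beyond the standard library. The names `sketch_stream`, `gram`, `ell`, and `d` are hypothetical and introduced for illustration.

```python
import math
import random


def sketch_stream(rows, ell, d, seed=0):
    """One-pass random-sign projection sketch (illustrative helper).

    Maintains an ell x d matrix B = S A, where A is the n x d matrix whose
    rows arrive in the stream and S is an ell x n matrix with independent
    entries +1/sqrt(ell) or -1/sqrt(ell). In expectation, B^T B = A^T A.
    Only B is stored, so memory is O(ell * d) regardless of stream length.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(ell)
    B = [[0.0] * d for _ in range(ell)]
    for row in rows:  # each data point is seen exactly once
        if len(row) != d:
            raise ValueError("every row must have d entries")
        for i in range(ell):
            # Draw the (i, current row) entry of S on the fly.
            sign = scale if rng.random() < 0.5 else -scale
            sketch_row = B[i]
            for j, value in enumerate(row):
                sketch_row[j] += sign * value
    return B


def gram(M):
    """Return M^T M as a list of lists."""
    d = len(M[0])
    return [[sum(r[i] * r[j] for r in M) for j in range(d)] for i in range(d)]


if __name__ == "__main__":
    # A generator stands in for a stream that cannot be stored or replayed.
    data_rng = random.Random(1)
    stream = ([data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(1000))
    B = sketch_stream(stream, ell=50, d=5)
    print(len(B), "x", len(B[0]), "sketch of a 1000 x 5 stream")
```

Comparing `gram(B)` with the Gram matrix of the full data is one way to judge how good the sketch is. The approximation generally improves as `ell` grows, at the cost of more memory.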