DIMACS Theoretical Computer Science Seminar

Title: Mergeable Summaries

Speaker: Graham Cormode, AT&T Labs Research

Date: Wednesday, November 16, 2011 11:00-12:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ


In dealing with massive distributed data, exact computations may be possible but can still be very expensive to perform. Instead, it is often desirable to look to more lightweight computations that give (guaranteed) approximations. A central approach is the idea of the mergeable summary: a compact data structure that summarizes a data set, which can be merged with others to summarize the union of the data without significant growth in size. This framework enables parallel computation. Samples and sketches are two examples of well-known, effective mergeable summaries. I'll present recent results on new, compact mergeable summaries for various tasks, such as approximating frequent items, order statistics, and range counts.

See: http://paul.rutgers.edu/~yixinxu/theory-fall11.html