## Compact summaries over large datasets, May 2015.
Invited tutorial at PODS 2015 and BICOD 2015.

A fundamental challenge in processing the massive quantities of information generated by modern
applications is extracting suitable representations of the data that can be stored, manipulated and
interrogated on a single machine. A promising approach is the design and analysis of compact
summaries: data structures that capture key features of the data and can be created effectively
over distributed data sets. Popular summary structures include count-distinct algorithms, which
compactly approximate the cardinality of item sets, and sketches, which allow vector norms and
products to be estimated. These are very attractive, since they can be computed in parallel and
combined to yield a single, compact summary of the data.
This tutorial introduces the concepts behind compact summaries and presents examples of them.
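
To make the "computed in parallel and combined" property concrete, here is a minimal sketch of one count-distinct summary, the K-minimum-values (KMV) approach. It is only an illustration and is not taken from the tutorial: the class name `KMVSketch`, the parameter `k`, and the choice of SHA-1 as the hash function are all assumptions made for this example.

```python
import hashlib


class KMVSketch:
    """Illustrative K-minimum-values summary for approximate distinct counting.

    Hypothetical example class: it keeps only the k smallest hash values seen.
    """

    def __init__(self, k=256):
        self.k = k
        self.values = set()  # at most k hash values, each in [0, 1)

    @staticmethod
    def _hash(item):
        # Map an item to a pseudo-random point in [0, 1) (assumed hash choice).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def add(self, item):
        self.values.add(self._hash(item))
        if len(self.values) > self.k:
            # Drop the largest value so only the k smallest remain.
            self.values.remove(max(self.values))

    def merge(self, other):
        # The k smallest values of the union are contained in the two inputs,
        # so merging loses nothing relative to summarising all data at once.
        merged = KMVSketch(self.k)
        merged.values = set(sorted(self.values | other.values)[: self.k])
        return merged

    def estimate(self):
        if len(self.values) < self.k:
            # Fewer than k distinct hashes seen: the count is exact.
            return float(len(self.values))
        # The k-th smallest of n uniform points sits near k / n.
        return (self.k - 1) / max(self.values)


# Usage: summarise two partitions independently, then combine the summaries.
left, right = KMVSketch(), KMVSketch()
for x in range(0, 6000):
    left.add(x)
for x in range(4000, 10000):
    right.add(x)
combined = left.merge(right)
print(round(combined.estimate()))  # approximates the 10000 distinct items
```

Each partition's summary holds at most `k` numbers regardless of how much data it has seen, and the merged summary is identical to the one that would have been built over the union of the data. Larger `k` gives a more accurate estimate at the cost of more space.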

