DIMACS TR: 2003-20
Improved Data Stream Summaries: The Count-Min Sketch and its Applications
Authors: Graham Cormode and S. Muthukrishnan
ABSTRACT
We introduce a new small space data structure -- the Count-Min
Sketch -- for summarizing data streams. Our sketch allows fundamental
queries in data stream summarization such as point, range and inner
product queries to be approximately answered quickly and efficiently; in
addition it can eb applied to solve several important problems in data
stream such as finding quantiles, frequent items, etc.
In both cases, the time and space bounds we show using the CM sketch
significantly improves those known for previous sketch constructions --
typically from 1/epsilon^2 to 1/epsilon in factor.
Paper Available at:
ftp://dimacs.rutgers.edu/pub/dimacs/TechnicalReports/TechReports/2003/2003-20.ps.gz
DIMACS Home Page