## G. Cormode.
Encyclopedia entry on 'Count-Min Sketch'.
In L. Liu and M. T. Ozsu, editors, *Encyclopedia of Database
Systems*, pages 511-516. Springer, 2009.

The Count-Min (CM) Sketch is a compact summary data structure capable
of representing a high-dimensional vector and answering queries on this vector,
in particular point queries and dot product queries,
with strong accuracy guarantees.
Such queries are at the core of many computations, so the structure
can be used in order to answer a variety of other queries, such as
frequent items (heavy hitters), quantile finding, join size
estimation, and more.
Since the data structure can easily process updates in the form of
additions or subtractions to dimensions of the vector (which may
correspond to insertions or deletions, or other transactions), it is
capable of working over streams of updates, at high rates.
The data structure maintains the linear projection of the vector with
a number of other random vectors.
These vectors are defined implicitly by simple hash functions.
Increasing the range of the hash functions increases the accuracy of
the summary, and increasing the number of hash functions decreases the
probability of a bad estimate.
These tradeoffs are quantified precisely below.
Because of this linearity, CM sketches can be scaled, added and
subtracted, to produce summaries of the corresponding scaled and
combined vectors.

[ bib |
http |
.pdf ]
Back

*This file was generated by
bibtex2html 1.92.*