G. Cormode and P. Indyk. Stable distributions in streaming computations. In M. Garofalakis, J. Gehrke, and R. Rastogi, editors, Data Stream Management: Processing High-Speed Data Streams. Springer, 2016.

In many streaming scenarios, we need to measure and quantify the data that is seen. For example, we may want to measure the number of distinct IP addresses seen over the course of a day, compute the difference between incoming and outgoing transactions in a database system or measure the overall activity in a sensor network. More generally, we may want to cluster readings taken over periods of time or in different places to find patterns, or find the most similar signal from those previously observed to a new observation. For these measurements and comparisons to be meaningful, they must be well-defined. Here, we will use the well-known and widely used Lp norms. These encompass the familiar Euclidean (root of sum of squares) and Manhattan (sum of absolute values) norms.

In this chapter, we will show how Lp distances can be estimated for massive vectors presented in the streaming model. This is achieved by making succinct sketches of the data, which can be used as synopses of the vectors they summarize.

bib | Alternate Version ] Back


This file was generated by bibtex2html 1.92.