G. Cormode and S. Muthukrishnan. Estimating dominance norms of multiple data streams. Technical Report 2002-35, Center for Discrete Mathematics and Computer Science (DIMACS), 2002.

Data streams often consist of multiple signals. Consider a stream of multiple signals (i,a_i,j) where i's correspond to the domain, j's index the different signals and a_i,j >= 0 to the value of the j'th signal at point i. We study the problem of determining the dominance norms over the multiple signals, in particular the max-dominance norm, defined as sum_i max_j a_i,j. It is used in applications to estimate the “worst case influence” of multiple processes, for example in IP traffic analysis, electrical grid monitoring and financial domain. Besides finding many applications, it is a natural measure: it generalizes the notion of union of data streams and may be alternately thought of as estimating the L1 norm of the upper envelope of multiple signals.

We present the first known data stream algorithm for estimating max-dominance of multiple signals. In particular, we use workspace and time-per-item that are both sublinear (in fact, poly-logarithmic) in the input size. The algorithm is simple and implementable; its analysis relies on using properties of stable random distributions with small parameter alpha, which may be a technique of independent interest. In contrast we show that other dominance norms - min-dominance sum_i min_j a_i,j, count-dominance (|i | a_i > b_i|) or relative-dominance (sum_i a_i/max1,b_i), are all impossible to estimate accurately with sublinear space.

bib | .html | Alternate Version | .ps.gz ] Back

This file was generated by bibtex2html 1.92.