## G. Cormode and S. Muthukrishnan.
Space efficient mining of multigraph streams.
In *ACM Principles of Database Systems (PODS)*, pages
271-282, 2005.

The challenge of monitoring massive amounts of data generated by
communication networks has led to the interest in data stream
processing.
We study streams of edges in massive communication multigraphs, defined by
(source, destination) pairs.
The goal is to compute properties of the underlying graph while using small
space (much smaller than the number of communicants),
and to avoid bias introduced because some edges may appear many times,
while others are seen only once.
We give results for three fundamental problems on multigraph degree sequences:
estimating frequency moments of degrees, finding the heavy hitter
degrees, and computing range sums of degree values.
In all cases we are able to show space bounds
for our summarizing algorithms
that are significantly smaller than storing complete information.
We use a variety of data stream methods:
sketches, sampling, hashing and distinct
counting, but a common feature is that we use the approach of
*cascaded summaries*: nesting multiple estimation techniques within one
another.
In our experimental study, we see that such summaries are highly
effective, enabling massive multigraph streams to be effectively
summarized to answer queries of interest with high accuracy using only
a small amount of space.

[ bib |
.pdf |
slides |
.pdf ]
Back

*This file was generated by
bibtex2html 1.92.*