## G. Cormode, F. Korn, and S. Tirthapura.
Time decaying aggregates in out-of-order streams.
Technical Report 2007-10, Center for Discrete Mathematics and
Computer Science (DIMACS), 2007.

Processing large data streams is now a major topic in data management.
The data involved can be truly massive, and the required analyses
complex. In a stream of sequential events such as stock feeds, sensor
readings, or IP traffic measurements, tuples pertaining to recent events
are typically more important than older ones. This can be formalized via
time decay functions, which assign weights based on age. Decay functions
such as sliding windows and exponential decay have been well studied
under the assumption of well-ordered arrivals, i.e., data arrives in the
order of increasing time stamps. However, data quality issues are
prevalent in massive streams (due to network asynchrony and delays or
possibly due to features inherent to the measurement process), and
correct order of arrival cannot be guaranteed. We focus on the
computation of decayed aggregates such as range queries, quantiles, and
heavy hitters on out-of-order streams, where elements do not necessarily
arrive in increasing order of timestamps. We give the first deterministic
algorithms for approximating these aggregates under the sliding window,
exponential and polynomial decay functions. Our techniques are based on
extending existing data stream summaries, such as the q-digest. We show
that the overhead for allowing out-of-order arrivals is small when
compared to the case of well-ordered arrivals. Our experimental study
confirms that these methods can be applied in practice and investigates
how the specific decay function impacts performance.

[ bib |
Alternate Version ]
Back

*This file was generated by
bibtex2html 1.92.*