Finding frequent items in data streams, March 2009. Talk at DIMACS Working group on Streaming, Coding and Compressive Sensing; AT&T Labs; UMass Amherst; Dartmouth College.

The frequent items problem is to process a stream of items and find all items occurring more than a given fraction of the time. It is one of the most heavily studied problems in data stream mining, dating back to the 1980s. Many applications rely directly or indirectly on finding the frequent items, and implementations are in use in large-scale industrial systems. In this talk, I survey the problem and present the most important algorithms in a common framework. I'll also show some experimental comparisons and discuss more recent results that explain why some of these algorithms give even better results than their initial analysis led us to believe.

