DIMACS - Graduate Student Combinatorics Seminar


Title: Hierarchical Graph Maps

Speaker: James Abello, DIMACS

Date: Wednesday, December 2, 2009 12:10pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ


Abstract:

A variety of massive data sets exhibit an underlying structure that can be modeled as dynamic weighted multi-digraphs. Their sizes range from tens of gigabytes to petabytes. These include the World Wide Web, Internet Traffic and Telephone Call Detail. These data sets sheer volume brings with it a series of computational and visualization challenges due mainly to the I/O and Screen Bottlenecks. We present external memory algorithms for connectivity and minimum spanning trees together with heuristics for quasi-clique finding. We describe how hierarchical Graph Maps help us to cope in a unified manner with both, the I/O and screen bottlenecks. This line of research has suggested the need to look for "novel" graph representations in order to provide a user or a data-mining engine with informed navigation paths. We will discuss our results with graphs having on the order of 100 million vertices and several billion edges and we will point out some mathematical problems that have surfaced along the way. The overall goal is to extract useful information that can be brought into a user's palm top and to export these techniques to other mining domains.