DIMACS Theoretical Computer Science Seminar


Title: Minimum Spanning Tree and Connectivity of Large Scale Graphs in MapReduce

Speaker: Donatella Firmani, Sapienza University of Rome

Date: Wednesday, May 9, 2012 11:00-12:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ


Abstract:

In the MapReduce framework, originally proposed by Google, and its open source implementation, Hadoop, the computation is performed over a set of (key,value) pairs and consists of a map stage, a shuffle stage and a reduce stage. During each round of the computation, these stages are executed sequentially, while data-intensive operations executed by a single stage can be parallelized over many machines. Assuming that the communication cost of the machines and their space usage must be small with respect to the input size, the design of algorithms that minimize the number of rounds becomes quite challenging. This work describes various solutions to the undirected graph connectivity problem and minimum spanning tree, discussing recent results in literature and highlighting open problems.

See: http://paul.rutgers.edu/~yixinxu/theory-spring12.html