### DIMACS Theoretical Computer Science Seminar

Title: Minimum Spanning Tree and Connectivity of Large Scale Graphs in MapReduce

Speaker: **Donatella Firmani**, Sapienza University of Rome

Date: Wednesday, May 9, 2012 11:00-12:00pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ

Abstract:

In the MapReduce framework, originally proposed by Google, and its open source implementation,
Hadoop, the computation is performed over a set of (key,value) pairs and consists of a map stage,
a shuffle stage and a reduce stage. During each round of the computation, these stages are executed
sequentially, while data-intensive operations executed by a single stage can be parallelized over
many machines. Assuming that the communication cost of the machines and their space usage must be
small with respect to the input size, the design of algorithms that minimize the number of rounds
becomes quite challenging. This work describes various solutions to the undirected graph
connectivity problem and minimum spanning tree, discussing recent results in literature
and highlighting open problems.

See: http://paul.rutgers.edu/~yixinxu/theory-spring12.html