High Quality Graph Partitioning
Peter Sanders and Christian Schulz, Karlsruhe Institute of Technology, Germany
We present an overview of our graph partitioners KaFFPa (Karlsruhe Fast Flow Partitioner) and KaFFPaE (KaFFPa Evolutionary). KaFFPa is a multilevel graph partitioning algorithm that, on the one hand, uses novel local improvement algorithms based on max-flow and min-cut computations as well as more-localized FM searches and, on the other hand, uses more sophisticated global search strategies transferred from multigrid linear solvers. KaFFPaE is a distributed evolutionary algorithm for the graph partitioning problem; it uses KaFFPa, which provides new, effective crossover and mutation operators. By combining these with a scalable communication protocol we obtain a system that is able to improve the best known partitioning results for many inputs in a very short amount of time.
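For orientation, the following is a minimal sketch of the generic multilevel scheme that partitioners such as KaFFPa instantiate (coarsen, partition the coarsest graph, then uncoarsen and refine level by level). It assumes the graph is any object supporting len() and iteration over its nodes (e.g. a dict of adjacency lists); the callables coarsen, initial_partition, and refine are hypothetical placeholders for the three phases, not KaFFPa's actual components.

# Minimal sketch of the generic multilevel partitioning paradigm (not KaFFPa itself).
# The callables coarsen / initial_partition / refine are hypothetical placeholders.

def multilevel_partition(graph, k, coarsen, initial_partition, refine, min_size=64):
    """Partition `graph` into k blocks following the multilevel paradigm."""
    hierarchy = []
    g = graph
    # Coarsening phase: contract the graph until it is small enough.
    while len(g) > min_size:
        coarser, mapping = coarsen(g)          # mapping: node of g -> node of coarser
        hierarchy.append((g, mapping))
        g = coarser
    # Initial partitioning on the coarsest graph.
    partition = initial_partition(g, k)        # dict: node -> block id
    # Uncoarsening phase: project the partition upward and refine it locally.
    for finer, mapping in reversed(hierarchy):
        partition = {v: partition[mapping[v]] for v in finer}   # projection
        partition = refine(finer, partition, k)  # e.g. FM or flow-based refinement
    return partition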
Graph Partitioning with Natural Cuts
Daniel Delling1, Andrew V. Goldberg1, Ilya Razenshteyn2, and Renato F. Werneck1
1 Microsoft Research Silicon Valley, USA 2 Lomonosov Moscow State University, Russia
We present a novel approach to graph partitioning based on the notion of natural cuts. Our algorithm, called PUNCH, has two phases. The first phase performs a series of minimum-cut computations to identify and contract dense regions of the graph. This reduces the graph size, but preserves its general structure. The second phase uses a combination of greedy and local search heuristics to assemble the final partition. The algorithm performs especially well on road networks, which have an abundance of natural cuts (such as bridges, mountain passes, and ferries). In a few minutes, it obtains excellent partitions for continental-sized networks.
Exact Combinatorial Branch-and-Bound for Graph Bisection
Daniel Delling1, Andrew V. Goldberg1, Ilya Razenshteyn2, and Renato F. Werneck1
1 Microsoft Research Silicon Valley, USA 2 Lomonosov Moscow State University, Russia
We present a novel exact algorithm for the minimum graph bisection problem, whose goal is to partition a graph into two equally sized cells while minimizing the number of edges between them. Our algorithm is based on the branch-and-bound framework and, unlike most previous approaches, it is fully combinatorial. We present stronger lower bounds, improved branching rules, and a novel decomposition technique that contracts entire regions of the graph without losing optimality guarantees. In practice, our algorithm works particularly well on instances with relatively small minimum bisections, solving large real-world graphs (with tens of thousands to millions of vertices) to optimality.
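To make the branch-and-bound framework concrete, here is a minimal generic sketch for graph bisection that uses only the trivial lower bound given by edges already forced across the cut; the paper's stronger bounds, branching rules, and decomposition technique are deliberately not reproduced.

# Minimal generic branch-and-bound sketch for minimum graph bisection.
# Only the trivial lower bound (edges already forced across the cut) is used.

def bisect(adj):
    """adj: dict vertex -> set of neighbours; the number of vertices is assumed even."""
    vertices = sorted(adj)
    n = len(vertices)
    best = {"cut": float("inf"), "assign": None}

    def forced_cut(assign):
        # Edges with both endpoints assigned to different sides.
        return sum(1 for u in assign for v in adj[u]
                   if v in assign and assign[u] != assign[v]) // 2

    def branch(i, assign, size0):
        if size0 > n // 2 or (i - size0) > n // 2:
            return                                  # balance constraint violated
        bound = forced_cut(assign)                  # trivial lower bound
        if bound >= best["cut"]:
            return                                  # prune
        if i == n:
            best["cut"], best["assign"] = bound, dict(assign)
            return
        v = vertices[i]
        for side in (0, 1):
            assign[v] = side
            branch(i + 1, assign, size0 + (side == 0))
            del assign[v]

    branch(0, {}, 0)
    return best["cut"], best["assign"]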
Session 2: Graph Partitioning and Related
Graph Partitioning for Scalable Distributed Graph Computations
Aydin Buluc, Lawrence Berkeley National Laboratory, and Kamesh Madduri, The Pennsylvania State University, USA
Inter-node communication time constitutes a significant fraction of the execution time of graph algorithms on distributed-memory systems. Global computations on large-scale sparse graphs with skewed degree distributions are particularly challenging to optimize for, as prior work shows that it is difficult to obtain balanced partitions with low edge cuts for these graphs. In this work, we attempt to determine the optimal partitioning and distribution of such graphs, for load-balanced parallel execution of communication-intensive graph algorithms. We use Breadth-First Search (BFS) as a representative example, and derive upper bounds on the communication costs incurred with a two-dimensional partitioning of the graph. We present empirical results for communication costs with various graph partitioning strategies, and also obtain parallel BFS execution times for several large-scale DIMACS Challenge graph instances on a large supercomputing platform. Our performance results indicate that for several graph instances, reducing work and communication imbalance among partitions is more important than minimizing the total edge cut.
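To make the notion of a two-dimensional partitioning concrete, here is a tiny sketch of a checkerboard-style edge distribution on an r x c process grid, under the simplifying assumption of contiguous vertex blocks: edge (u, v) is stored by the process whose grid row owns u and whose grid column owns v, so BFS communication stays within grid rows and columns. This illustrates only the general idea, not the authors' exact layout.

# Simplified sketch of a 2D (checkerboard) edge distribution on an r x c process
# grid; an illustration of the general idea, not the paper's implementation.

def owner(u, v, n, r, c):
    """Grid coordinates (row, col) of the process storing edge (u, v)."""
    row_block = u * r // n          # which of the r contiguous vertex blocks contains u
    col_block = v * c // n          # which of the c contiguous vertex blocks contains v
    return row_block, col_block

def distribute_edges(edges, n, r, c):
    """Group an edge list by owning process on the r x c grid."""
    buckets = {(i, j): [] for i in range(r) for j in range(c)}
    for u, v in edges:
        buckets[owner(u, v, n, r, c)].append((u, v))
    return buckets

# Example: 8 vertices on a 2 x 2 process grid.
edges = [(0, 1), (1, 5), (2, 6), (4, 7), (6, 3)]
print(distribute_edges(edges, n=8, r=2, c=2))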
Shape Optimizing Load Balancing for Parallel Adaptive Numerical Simulations Using MPI
Henning Meyerhenke, Karlsruhe Institute of Technology, Germany
Load balancing is an important requirement for the efficient execution of numerical simulations on parallel computers. In particular, when the simulation domain changes over time, the mapping of computational tasks to processors needs to be modified accordingly. Most state-of-the-art libraries addressing this problem are based on graph repartitioning with a parallel variant of the Kernighan-Lin (KL) heuristic. The KL approach has a number of drawbacks, including the metric it optimizes and solutions with undesirable properties.
Here we further explore the promising diffusion-based multilevel graph partitioning algorithm DibaP. We describe the evolution of the algorithm and report on its MPI implementation PDibaP for parallelism with distributed memory. PDibaP is targeted at small to medium scale parallelism with dozens of processors. The presented experiments use graph sequences that imitate adaptive numerical simulations. They demonstrate the applicability and quality of PDibaP for load balancing by repartitioning on this scale. Compared to the faster ParMETIS, PDibaP's solutions often have partitions with fewer external edges and a smaller communication volume in an underlying numerical simulation.
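To illustrate the diffusion principle that DibaP builds on (not the algorithm itself, which diffuses load within the simulation graph and combines it with multilevel coarsening), here is a tiny sketch of first-order diffusive load balancing on a processor graph: each processor repeatedly exchanges a fixed fraction of its load difference with each neighbor, which converges to a balanced load.

# Tiny sketch of first-order diffusive load balancing on a processor graph.

def diffuse(load, neighbors, alpha=0.25, iterations=100):
    """load: dict proc -> work; neighbors: dict proc -> list of neighbouring procs."""
    load = dict(load)
    for _ in range(iterations):
        flow = {p: 0.0 for p in load}
        for p in load:
            for q in neighbors[p]:
                # p sends alpha * (load[p] - load[q]) to q (negative means receive)
                flow[p] -= alpha * (load[p] - load[q])
        for p in load:
            load[p] += flow[p]
    return load

# Example: a path of four processors with all load initially on processor 0.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(diffuse({0: 100.0, 1: 0.0, 2: 0.0, 3: 0.0}, neighbors))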
Scalable and Accurate Algorithm for Graph Clustering
Hristo N. Djidjev, Los Alamos National Laboratory, USA, and Melih Onus, Cankaya University, Turkey
One of the most useful measures of quality for graph clustering is the modularity of the partition, which measures the difference between the number of edges with endpoints in the same cluster and the expected number of such edges in a random graph. In this paper we show that the problem of finding a partition maximizing the modularity of a given graph G can be reduced to a minimum weighted cut problem on a complete graph with the same vertices as G. We then show that the resulting minimum cut problem can be efficiently solved by adapting existing graph partitioning tools. Our algorithm is accurate and finds graph clusterings much faster than alternative algorithms that produce a comparable clustering quality.
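The reduction claimed above rests on a standard identity; a brief sketch in LaTeX, using standard notation for an unweighted graph with adjacency matrix A, vertex degrees d_u, m edges, and cluster assignment c:

% Modularity of a clustering and its equivalent cut form.  Because
% \sum_{u,v} (A_{uv} - d_u d_v / (2m)) = 0, maximizing Q is the same as
% minimizing the weighted cut of the complete graph with (possibly negative)
% edge weights w(u,v) = A_{uv} - d_u d_v / (2m).
Q \;=\; \frac{1}{2m}\sum_{\substack{u,v\\ c(u)=c(v)}}\Bigl(A_{uv}-\frac{d_u d_v}{2m}\Bigr)
  \;=\; -\,\frac{1}{2m}\sum_{\substack{u,v\\ c(u)\neq c(v)}}\Bigl(A_{uv}-\frac{d_u d_v}{2m}\Bigr).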
Abusing a hypergraph partitioner for unweighted graph partitioning
B. O. Fagginger Auer and R. H. Bisseling, Utrecht University, Netherlands
We investigate using the Mondriaan matrix partitioner for unweighted graph partitioning in the communication volume and edge-cut metrics. By converting the unweighted graphs to appropriate matrices, we measure Mondriaan's performance as a graph partitioner for the 10th DIMACS challenge on graph partitioning and clustering. We find that Mondriaan can effectively be used as a graph partitioner: w.r.t. the edge-cut metric, Mondriaan's results are on average within 21% of the best known results as listed in Chris Walshaw's partitioning archive.
An Evaluation of the Zoltan Parallel Graph and Hypergraph Partitioners
Sivasankaran Rajamanickam and Erik G. Boman, Sandia National Laboratories, USA
Graph partitioning is an important and well studied problem in combinatorial scientific computing, and is commonly used to reduce communication in parallel computing. Different models (graph, hypergraph) and objectives (edge cut, boundary vertices) have been proposed. For large problems, the partitioning itself must be done in parallel. Several software packages, such as ParMetis, PT-Scotch and Zoltan are widely available. In this paper we evaluate the performance of the parallel graph and hypergraph (PHG) partitioner in the Zoltan toolkit. For the first time, we compare the performance of PHG as a graph and hypergraph partitioner across a diverse set of graphs from the 10th DIMACS implementation challenge.
UMPa: A Multi-objective, multi-level partitioner for communication minimization
Umit V. Catalyurek, Mehmet Deveci, and Kamer Kaya, The Ohio State University, USA, and Bora Ucar, LIP, ENS Lyon, France
We propose a directed hypergraph model and a refinement heuristic to distribute communicating tasks among the processing units in a distributed-memory setting. The aim is to achieve load balance and minimize the maximum data sent by a processing unit. We also take two other communication metrics into account with a tie-breaking scheme. With this approach, task distributions that cause excessive use of the network or a bottleneck processor participating in almost all of the communication are avoided. We show on a large number of problem instances that our model improves the maximum data sent by a processor by up to 34% for parallel environments with 4, 16, 64, and 256 processing units compared to the state of the art, which minimizes only the total communication volume.
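For concreteness, a small sketch of the primary metric targeted above: given an assignment of tasks to parts and a set of directed communications, compute the volume each part sends and take the maximum, with the total volume available as a tie-breaker. The data layout here is hypothetical and only illustrates the metric, not UMPa's directed hypergraph model or its refinement heuristic.

# Sketch of the communication metrics: maximum send volume (primary) and total
# volume (tie-breaker) for a given assignment of tasks to parts.
from collections import defaultdict

def send_volumes(part_of, messages):
    """part_of: dict task -> part; messages: iterable of (src_task, dst_task, size)."""
    sent = defaultdict(int)
    total = 0
    for src, dst, size in messages:
        if part_of[src] != part_of[dst]:
            sent[part_of[src]] += size
            total += size
    return max(sent.values(), default=0), total

# Example: three tasks on two parts.
part_of = {"a": 0, "b": 0, "c": 1}
messages = [("a", "c", 4), ("b", "c", 2), ("c", "a", 1)]
print(send_volumes(part_of, messages))   # (6, 7): part 0 sends 6 units in total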
A Divisive clustering technique for maximizing the modularity
Umit V. Catalyurek and Kamer Kaya, The Ohio State University, USA, and Johannes Langguth and Bora Ucar, LIP, ENS Lyon, France
We present a new graph clustering algorithm aimed at obtaining clusterings of high modularity. The algorithm pursues a divisive clustering approach and uses established graph partitioning algorithms and techniques to compute recursive bipartitions of the input as well as to refine clusters. Experimental evaluation shows that the modularity scores obtained compare favorably to many previous approaches. In the majority of test cases, the algorithm outperforms the best known alternatives. In particular, among 13 problem instances common in the literature, the proposed algorithm improves the best known modularity in 9 cases.
Modularity Maximization in Networks by Variable Neighborhood Search
Daniel Aloise, Universidade Federal do Rio Grande do Norte, Brazil, Gilles Caporossi, Pierre Hansen, and Sylvain Perron, GERAD & HEC Montreal, Canada, Leo Liberti, LIX, Ecole Polytechnique, France, and Manuel Ruiz, INP-Grenoble, France
Finding communities, or clusters, in networks, or graphs, has been the subject of intense study in the last ten years. The most used criterion for that purpose, despite some recent criticism, is modularity maximization, proposed by Newman and Girvan. It consists of maximizing, over all clusters, the sum of the number of inner edges minus the expected number of inner edges assuming the same distribution of degrees. Numerous heuristics, as well as a few exact algorithms, have been proposed to maximize modularity. We apply the Variable Neighborhood Search metaheuristic to that problem. The resulting Variable Neighborhood Decomposition Search heuristic is applied to the clustering problems of the 10th DIMACS Implementation Challenge. The heuristic obtains the optimal solution whenever it is known, and otherwise improves on the best solutions from the literature, except for three instances for which the available memory was too small.
Community Detection by Modularity Maximization using GRASP with Path Relinking
Maria C. V. Nascimento, Universidade Federal de Sao Paulo, Brazil, and Leonidas S. Pitsoulis, Aristotle University of Thessaloniki, Thessaloniki, Greece
Detection of community structure in graphs remains to this date a computationally challenging problem despite the efforts of many researchers from various scientific fields in the past few years. The modularity value of a set of vertex clusters in a graph is a widely used quality measure for community structure, and the related problem of finding a partition of the vertices into clusters such that the corresponding modularity is maximized is NP-hard. A Greedy Randomized Adaptive Search Procedure (GRASP) with path relinking for modularity maximization in undirected graphs is presented in this paper. A new class of {0, 1} matrices is introduced which characterizes the family of clusterings in a graph, and a distance function is given which enables us to define an l-neighborhood local search that generalizes most of the related local search methods that have appeared in the literature. Computational experiments comparing the proposed algorithm with other heuristics from the literature on a set of well-known benchmark instances indicate that our implementation of GRASP with path relinking consistently produces better-quality solutions.
Using Stable Communities for Maximizing Modularity
S. Srinivasan and S. Bhowmick, University of Nebraska at Omaha, USA
Like many other combinatorial heuristics, the results of modularity maximization are affected by the order in which the vertices in the network are processed. However, there also exist stable communities, that is, groups of vertices that remain in the same community independent of perturbations to the input graph. We present a preprocessing step that identifies pockets of stable communities and then combines them before executing a community detection algorithm. Our results show that this initial combination step can increase the modularity of the network.
Complete Hierarchical Community Clustering: A Case Study on Expansion and Modularity
Michael Hamann, Tanja Hartmann and Dorothea Wagner, Karlsruhe Institute of Technology, Germany
We present a simple and efficient method for constructing a community-clustering hierarchy as introduced by Flake et al. Community-clusterings depend on a parameter that provides a quality guarantee in terms of expansion, which is NP-hard to compute. For different parameter values the clusterings form a hierarchy. In this work we introduce a parametric search approach that guarantees the completeness of the resulting hierarchy and removes the need to choose feasible parameter values in advance or to apply a binary search to find those values. Our method is very easy to implement, and in a brief running-time experiment it turns out to be significantly faster than a binary search.
We further study the resulting clusterings with respect to their quality guarantee. In our experiments the actual quality of the clusterings is even better than guaranteed, and compared to trivial expansion bounds the guarantee constitutes a true gain of knowledge. Further experiments focusing on modularity show that for some instances the algorithm of Flake et al. is very strict when assigning vertices to clusters; however, for the majority of the tested graphs the community-clusterings come close to reliable reference clusterings, although the algorithm is not designed to optimize modularity.
Keynote: DIMACS Implementation Challenges: Past, Present, and Future
David S. Johnson, AT&T Labs - Research
Abstract:
This talk is being presented at the workshop for the 10th DIMACS Implementation Challenge. In it, I will describe the history and impact of the Implementation Challenge series, and also talk about directions for the future.
Parallel Community Detection for Massive Graphs
E. Jason Riedy, David Ediger, and David A. Bader, Georgia Institute of Technology, USA, and Henning Meyerhenke, Georgia Institute of Technology and Karlsruhe Institute of Technology, Germany
Tackling the current volume of graph-structured data requires parallel tools. We extend our work on analyzing such massive graph data with the first massively parallel algorithm for community detection that scales to current data sizes, handling graphs of over 122 million vertices and nearly 2 billion edges in under 7300 seconds on a massively multithreaded Cray XMT. Our algorithm achieves moderate parallel scalability without sacrificing sequential operational complexity. Community detection partitions a graph into subgraphs that are more densely connected internally than to the rest of the graph. We take an agglomerative approach similar to Clauset, Newman, and Moore's sequential algorithm, merging pairs of connected intermediate subgraphs to optimize different graph properties. Working in parallel opens new approaches to high performance. On smaller data sets, we find that the output's modularity compares well with that of the standard sequential algorithms.
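As a reference point for the agglomerative approach described above, here is a tiny sequential sketch in the spirit of Clauset, Newman, and Moore: repeatedly merge the pair of connected communities with the largest modularity gain, ΔQ = l_ij/m - d_i d_j/(2 m^2). It is quadratic and unoptimized, and only illustrates the merge step that the parallel algorithm performs in bulk.

# Tiny sequential sketch of agglomerative modularity clustering (CNM-style);
# purely illustrative of the greedy merge step, not the paper's parallel algorithm.

def agglomerate(edges):
    m = len(edges)
    comm = {}                       # vertex -> community id
    deg = {}                        # community -> total degree
    between = {}                    # frozenset({c1, c2}) -> number of edges between them
    for u, v in edges:
        for x in (u, v):
            comm.setdefault(x, x)
            deg[x] = deg.get(x, 0) + 1
        if u != v:
            key = frozenset((u, v))
            between[key] = between.get(key, 0) + 1
    while True:
        best, best_gain = None, 0.0
        for key, l_ij in between.items():
            c1, c2 = tuple(key)
            gain = l_ij / m - deg[c1] * deg[c2] / (2.0 * m * m)   # modularity gain
            if gain > best_gain:
                best, best_gain = (c1, c2), gain
        if best is None:
            return comm             # no merge improves modularity any further
        c1, c2 = best               # merge community c2 into c1
        for x, c in comm.items():
            if c == c2:
                comm[x] = c1
        deg[c1] += deg.pop(c2)
        new_between = {}
        for key, l in between.items():
            a, b = tuple(key)
            a = c1 if a == c2 else a
            b = c1 if b == c2 else b
            if a != b:
                nk = frozenset((a, b))
                new_between[nk] = new_between.get(nk, 0) + l
        between = new_between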
Graph Coarsening and Clustering on the GPU
B. O. Fagginger Auer and R. H. Bisseling, Utrecht University, Netherlands
Agglomerative clustering is an effective greedy way to generate graph clusterings of high modularity in a small amount of time. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained shared-memory parallel graph coarsening algorithm and use this to implement a parallel agglomerative clustering heuristic on both the CPU and GPU. This heuristic is able to generate clusterings in very little time: a modularity 1.00 clustering is obtained from a street network graph with 14 million vertices and 17 million edges in six seconds on the GPU.
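A small sketch of one common coarsening building block, namely greedy heavy-edge matching followed by contraction of matched vertex pairs; the fine-grained parallel matching of the paper is not reproduced here, and the data layout is a hypothetical simplification.

# Small sketch of sequential graph coarsening: greedy heavy-edge matching
# followed by contraction of matched pairs into coarse vertices.

def heavy_edge_matching(weights):
    """weights: dict frozenset({u, v}) -> edge weight.  Greedy by descending weight."""
    matched = {}
    for key, _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        u, v = tuple(key)
        if u not in matched and v not in matched:
            matched[u], matched[v] = v, u
    return matched

def contract(weights, matched):
    """Merge each matched pair into one coarse vertex; sum parallel edge weights."""
    coarse_of = {}
    for key in weights:
        for x in tuple(key):
            if x not in coarse_of:
                mate = matched.get(x)
                coarse_of[x] = min(x, mate) if mate is not None else x
    coarse_weights = {}
    for key, w in weights.items():
        u, v = tuple(key)
        cu, cv = coarse_of[u], coarse_of[v]
        if cu != cv:
            ck = frozenset((cu, cv))
            coarse_weights[ck] = coarse_weights.get(ck, 0) + w
    return coarse_weights, coarse_of

# Example: a 4-cycle with two heavy edges that get contracted.
g = {frozenset(p): w for p, w in [((0, 1), 5), ((1, 2), 1), ((2, 3), 5), ((3, 0), 1)]}
print(contract(g, heavy_edge_matching(g)))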
Experiments on Density-Constrained Graph Clustering
Robert Gorke, Andrea Schumm, and Dorothea Wagner, Karlsruhe Institute of Technology, Germany
Clustering a graph means identifying internally dense subgraphs which are only sparsely interconnected. Formalizations of this notion lead to measures that quantify the quality of a clustering and to algorithms that actually find clusterings. Since the corresponding optimization problems are, in general, hard, practice resorts to heuristic clustering algorithms or to other approaches that are not based on an objective function. In this work we conduct a comprehensive experimental evaluation of the qualitative behavior of greedy bottom-up heuristics driven by cut-based objectives and constrained by intracluster density, using both real-world data and artificial instances. Our study documents that a greedy strategy based on local movement is superior to one based on merging. We further reveal that the former approach generally outperforms alternative setups and reference algorithms from the literature in terms of its own objective, while a modularity-based algorithm competes surprisingly well. Finally, we exhibit which combinations of cut-based inter- and intracluster measures are suitable for identifying a hidden reference clustering in synthetic random graphs.
An Ensemble Learning Strategy for Graph Clustering
Michael Ovelgonne and Andreas Geyer-Schulz, Karlsruhe Institute of Technology, Germany
This paper is on a graph clustering scheme inspired by ensemble learning. In short, the idea of ensemble learning is to learn several weak classifiers and use these weak classifiers to determine a strong classifier. In this contribution, we use the generic procedure of ensemble learning and determine several weak graph clusterings (with respect to the objective function). From the partition given by the maximal overlap of these clusterings (the cluster cores), we continue the search for a strong clustering. We demonstrate the performance of this scheme by using it to maximize the modularity of a graph clustering. We show that the quality of the initial weak clusterings is of minor importance for the quality of the final result of the scheme if we iterate the process of restarting from maximal overlaps.
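The "maximal overlap" step can be made concrete in a few lines: each vertex receives a signature consisting of its cluster label in every weak clustering, and vertices sharing a signature form a core that is treated as a single unit in the subsequent search. This is only a hedged illustration of the overlap computation, not the authors' full scheme.

# Small sketch of computing cluster cores as the maximal overlap of several
# weak clusterings; vertices with identical label signatures form one core.
from collections import defaultdict

def cluster_cores(weak_clusterings):
    """weak_clusterings: list of dicts, each mapping vertex -> cluster label."""
    cores = defaultdict(list)
    for v in weak_clusterings[0]:
        signature = tuple(c[v] for c in weak_clusterings)
        cores[signature].append(v)
    return list(cores.values())

# Example: two weak clusterings of five vertices.
c1 = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
c2 = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
print(cluster_cores([c1, c2]))   # [['a', 'b'], ['c'], ['d', 'e']]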
Network Clustering via Clique Relaxations: A Community Based Approach
Anurag Verma, Sergiy Butenko, Texas A&M University, USA
In this paper, we present a general-purpose network clustering algorithm based on a novel clique relaxation concept of k-community, which is defined as a connected subgraph such that endpoints of every edge have at least k common neighbors within the subgraph. A salient feature of this approach is that it does not use any prior information about the structure of the network. By defining a cluster as a k-community, the proposed algorithm aims to provide a clustering of a network into k-communities with varying values of k. Even though the algorithm is not designed to optimize any particular performance measure, the computational results suggest that it performs well on a number of criteria that are used in the literature to evaluate the quality of a clustering.
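The k-community definition lends itself to a simple peeling sketch: repeatedly delete every edge whose endpoints share fewer than k common neighbors, then report the connected components of the surviving edges. This illustrates the definition for a single k; it is not the paper's algorithm for producing clusterings across varying values of k.

# Minimal sketch of extracting k-communities by edge peeling for one value of k.

def k_communities(adj, k):
    """adj: dict vertex -> set of neighbours (undirected, no self-loops)."""
    adj = {u: set(vs) for u, vs in adj.items()}
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                if u < v and len(adj[u] & adj[v]) < k:
                    adj[u].discard(v)      # endpoints have too few common neighbours
                    adj[v].discard(u)
                    changed = True
    # Connected components of the surviving edges.
    seen, components = set(), []
    for start in adj:
        if start in seen or not adj[start]:
            continue
        stack, comp = [start], set()
        while stack:
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(adj[x] - comp)
        seen |= comp
        components.append(comp)
    return components

# Example: two triangles joined by a single edge split into two 1-communities.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(k_communities(adj, 1))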