« Fully Scalable MPC Algorithms for Clustering in High Dimension
May 09, 2024, 10:45 AM - 11:15 AM
Location:
DIMACS Center
Rutgers University
CoRE Building
96 Frelinghuysen Road
Piscataway, NJ 08854
Click here for map.
Robert Krauthgamer, Weizmann Institute of Science
We design new algorithms for k-clustering in high-dimensional Euclidean spaces. These algorithms run in the Massively Parallel Computation (MPC) model, and are fully scalable, meaning that the local memory in each machine is an arbitrarily small poly(n) for input size n, which importantly may be substantially smaller than k. Our algorithms are fast, i.e., take O(1) rounds, and achieve O(1)-bicriteria approximation for k-Median and for k-Means, while previous work achieves only polylog(n)-approximation or handles special cases.
Our results rely on a fast MPC algorithm for O(1)-approximation of facility location. A primary technical tool that we develop, and may be of independent interest, is a new MPC primitive for geometric aggregation, namely, computing certain statistics on an approximate neighborhood of every data point, which includes range counting and nearest-neighbor search.
Joint work with Artur Czumaj, Guichen Gao, Shaofeng H.-C. Jiang, and Pavel Vesely.
[Video]