TCS+ Talk
Abstract: The MPC model has emerged as a compelling model for modern parallel systems, such as MapReduce, Hadoop and Spark, capturing well coarse-grained computation on large data. Here, data is distributed to processors, each of which has a sublinear (in the input data) amount of memory and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and the recent research program has been to design algorithms whose running time is smaller than in the PRAM model.
One now-classic challenge is the graph connectivity problem. On an undirected graph with $n$ nodes and $m$ edges, $O(\log n)$ round connectivity algorithms have been known for over 35 years. However, no algorithms with better complexity bounds are known for MPC (in fact even impossible for some restricted algorithms).
We give a faster algorithms for the connectivity problem when parameterizing the time complexity as a function of the diameter of the graph. Our main result is a O(log D * loglog_{m/n} n) time connectivity algorithm for diameter-D graphs, using O(m) total memory.
Joint work with Zhao Song, Cliff Stein, Zhengyu Wang, Peilin Zhong (to appear in FOCS'18).