DIMACS Theoretical Computer Science Seminar

Title: Learning Communities in the Presence of Errors

Speaker: Aravindan Vijayaraghavan, Northwestern University

Date: Wednesday, March 30, 2016 11:00am-12:00pm

Location: CoRE Bldg, Room 301, Rutgers University, Busch Campus, Piscataway, NJ

Abstract:

The Stochastic Block Model or the Planted Partition Model is the most widely used probabilistic model for community detection and clustering graphs in various fields, including machine learning, statistics, and social sciences. Many existing algorithms successfully learn the communities or clusters when the data is drawn exactly according to the model, but they do not work well in the presence of errors. In this talk, I will address the following question: Can we design robust polynomial time algorithms for learning probabilistic models for community detection that work in the presence of adversarial modelling errors?

I will present robust algorithms for (partially) recovering communities or clusters in graphs drawn from the Stochastic Block Model, in the presence of modeling errors or noise. These algorithms allow two types of adversarial errors: edge outlier errors (where an adversary can corrupt an arbitrary ϵ fraction of the edges), and any amount of Feige-Kilian or monotone errors. Mossel, Neeman and Sly (STOC 2015) posed an open question about whether an almost exact recovery is possible when the adversary is allowed to add o(n) edges. Our work answers this question affirmatively even in the case of k>2 communities.

Finally, I will describe how our algorithms recover the clusters, even when the modelling error is captured using Kullback-Leibler (KL) divergence: these algorithms work when the instances come from any distribution of graphs that is ϵm close to the Stochastic Block Model in the KL divergence (this result also handles adversarial errors).

This is based on joint work with Konstantin Makarchev and Yury Makarychev.

See: http://www.cs.rutgers.edu/~allender/theory-spring16.html