skip to main content
Caltech

Theory of Computing Seminar

Thursday, November 3, 2016
1:30pm to 2:30pm
Add to Cal
Annenberg 213
Computational Efficiency and Robust Statistics
Ilias Diakonikolas, USC,

Abstract:

Suppose we would like to estimate the mean of a 
high-dimensional Gaussian. It is easy to see that the sample mean is an 
accurate estimator of the mean. Now suppose that an adversary corrupts a 
small fraction of our samples. Can we still recover the true mean, up to 
a small error? A moment's thought reveals that the sample mean is not a 
good estimator in this noisy setting.

The concept of the Tukey median is a robust estimator of the mean in the 
noisy setting under quite general conditions. Unfortunately, it is hard 
to compute in the worst-case, even approximately. This prompts the 
following question: Can we reconcile robustness 
andcomputationalefficiency in high-dimensional distribution estimation?

In this talk, I will describe a polynomial time spectral technique to 
robustly estimate the mean of a high-dimensional Gaussian, up to 
*near-optimal* accuracy. Time permitting, I will also discuss a 
plausible computational barrier towards improving the performance of 
this algorithm.

The aforementioned spectral technique forms the basis of a general 
recipe to efficiently detect corruptions in high dimensions. As an 
application of this recipe, we obtain the first polynomial time 
algorithms for robustly learning various families of high-dimensional 
distributions, including Gaussians (with unknown mean and covariance), 
discrete product distributions, mixtures thereof (under some natural 
conditions), and certain families of graphical models.

The talk will be based on joint works with (different subsets of) G. 
Kamath, D. Kane, J. Li, A. Moitra, and A. Stewart.
For more information, please contact Thomas Vidick by email at [email protected].