Theory of Computing Seminar
The classical center based clustering problems such as
k-means/median/center assume that the optimal clusters satisfy the
locality property that the points in the same cluster are close to each
other. A number of clustering problems arise in machine learning where
the optimal clusters do not follow such a locality property. For
instance, consider the r-gather clustering problem where there is an
additional constraint that each of the clusters should have at least r
points or the capacitated clustering problem where there is an upper
bound on the cluster sizes. In this work, we consider an extreme
generalization of such problems by assuming that the target clustering
is an arbitrary partition of the dataset. We show some interesting
results in this scenario.
(This is a joint work with Anup Bhattacharya (IIT Delhi) and Amit Kumar
(IIT Delhi))
Arxiv link: http://arxiv.org/abs/1504.02564