Clustering a Mixture of Gaussians with Unknown Covariance

Davis, Damek, Diaz, Mateo, Wang, Kaizheng

Oct-4-2021–arXiv.org Machine Learning

We investigate a clustering problem with data from a mixture of Gaussians that share a common but unknown, and potentially ill-conditioned, covariance matrix. We start by considering Gaussian mixtures with two equally-sized components and derive a Max-Cut integer program based on maximum likelihood estimation. We prove its solutions achieve the optimal misclassification rate when the number of samples grows linearly in the dimension, up to a logarithmic factor. However, solving the Max-cut problem appears to be computationally intractable. To overcome this, we develop an efficient spectral algorithm that attains the optimal rate but requires a quadratic sample size. Although this sample complexity is worse than that of the Max-cut problem, we conjecture that no polynomial-time method can perform better. Furthermore, we gather numerical and theoretical evidence that supports the existence of a statistical-computational gap. Finally, we generalize the Max-Cut program to a $k$-means program that handles multi-component mixtures with possibly unequal weights. It enjoys similar optimality guarantees for mixtures of distributions that satisfy a transportation-cost inequality, encompassing Gaussian and strongly log-concave distributions.

algorithm, equation, inequality, (15 more...)

arXiv.org Machine Learning

Oct-4-2021

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - California (0.04)
  - New York > New York County
    - New York City (0.04)
- Europe > United Kingdom
  - England
    - Oxfordshire > Oxford (0.04)
    - Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.54)
  - Machine Learning
    - Statistical Learning > Clustering (0.87)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.54)