Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models
Neural Information Processing Systems
We study the gradient Expectation-Maximization (EM) algorithm for Gaussian Mixture Models (GMM) in the over-parameterized setting, where a general GMM with n > 1 components learns from data generated by a single ground-truth Gaussian distribution. While results for the special case of 2-Gaussian mixtures are well known, a general global convergence analysis for arbitrary n remains unresolved and faces several new technical barriers, since the convergence becomes sublinear and non-monotonic. To address these challenges, we construct a novel likelihood-based convergence analysis framework and rigorously prove that gradient EM converges globally at a sublinear rate O(1/√t). This is the first global convergence result for Gaussian mixtures with more than 2 components. The sublinear rate arises from the algorithmic nature of learning over-parameterized GMMs with gradient EM. We also identify a new technical challenge that emerges when learning general over-parameterized GMMs: the existence of bad local regions that can trap gradient EM for an exponential number of steps.
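As a rough illustration of the setting described in the abstract, the sketch below shows one plausible form of sample-based gradient EM for a GMM with uniform mixing weights and identity covariances, where only the component means are learned from data drawn from a single ground-truth Gaussian. The step size, component count, sample size, and data generation here are illustrative assumptions, not the paper's analysis or experimental setup.

```python
import numpy as np

def gradient_em_step(mu, X, eta=0.5):
    """One gradient EM update of the component means mu (n, d) on data X (m, d)."""
    # E-step quantities: posterior responsibilities under the current means,
    # assuming uniform mixing weights and identity covariances.
    sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)   # (m, n)
    log_w = -0.5 * sq_dist
    log_w -= log_w.max(axis=1, keepdims=True)                        # numerical stability
    w = np.exp(log_w)
    w /= w.sum(axis=1, keepdims=True)                                # responsibilities (m, n)
    # Gradient ascent step on the sample log-likelihood:
    # grad_j = (1/m) * sum_i w_ij * (x_i - mu_j).
    grad = (w[:, :, None] * (X[:, None, :] - mu[None, :, :])).mean(axis=0)
    return mu + eta * grad

rng = np.random.default_rng(0)
d, n, m = 2, 4, 2000                 # illustrative dimensions, not from the paper
X = rng.standard_normal((m, d))      # data from a single ground-truth Gaussian N(0, I)
mu = rng.standard_normal((n, d))     # n > 1 components: the over-parameterized regime
for _ in range(500):
    mu = gradient_em_step(mu, X)
print(mu)                            # the means drift slowly toward the true mean 0
```

In this regime the fitted means contract toward the single true mean only gradually, which is consistent with the sublinear behavior the paper analyzes.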