AITopics | expectation maximization

Expectation Maximization (EM) Converges for General Agnostic Mixtures

Mixture of linear regression is well studied in statistics and machine learning, where the data points are generated probabilistically using $k$ linear models. Algorithms like Expectation Maximization (EM) may be used to recover the ground truth regressors for this problem. Recently, in \cite{pal2022learning,ghosh_agnostic}, the mixed linear regression problem was studied in the agnostic setting, where no generative model on the data is assumed. Rather, given a set of data points, the objective is to \emph{fit} $k$ lines by minimizing a suitable loss function. It is shown that a modification of EM, namely gradient EM, converges exponentially to an appropriately defined loss minimizer even in the agnostic setting. In this paper, we study the problem of \emph{fitting} $k$ parametric functions to a given set of data points. We adhere to the agnostic setup. However, instead of fitting lines equipped with quadratic loss, we consider arbitrary parametric function fitting equipped with a strongly convex and smooth loss. This framework encompasses a large class of problems including mixed linear regression (regularized), mixed linear classifiers (mixed logistic regression, mixed Support Vector Machines) and mixed generalized linear regression. We propose and analyze gradient EM for this problem and show that, with proper initialization and a separation condition, the iterates of gradient EM converge exponentially to appropriately defined population loss minimizers with high probability. This shows the effectiveness of EM-type algorithms, which converge to the \emph{optimal} solution in the non-generative setup beyond mixture of linear regression.
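As an illustration of the gradient EM idea described in the abstract above, here is a minimal sketch in Python. It assumes the simplest possible instance: two scalar slopes fitted to 1-D data under squared loss, with soft assignments given by a softmax of the negative losses. The function names, step size, iteration count, and toy data are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def soft_assign(losses):
    """Softmax of the negative losses (shifted by the minimum for stability)."""
    smallest = min(losses)
    weights = [math.exp(-(loss - smallest)) for loss in losses]
    total = sum(weights)
    return [w / total for w in weights]


def gradient_em(xs, ys, thetas, step=0.1, iters=50):
    """Fit len(thetas) lines y = theta * x with one gradient step per iteration.

    Assumption: squared loss and softmax responsibilities; this is a sketch,
    not the exact algorithm analysed in the paper.
    """
    thetas = list(thetas)
    n = len(xs)
    for _ in range(iters):
        grads = [0.0] * len(thetas)
        for x, y in zip(xs, ys):
            residuals = [y - t * x for t in thetas]
            # E-like step: soft responsibility of each line for this point.
            probs = soft_assign([r * r for r in residuals])
            # Accumulate the responsibility-weighted gradient of (y - theta*x)^2.
            for j, (p, r) in enumerate(zip(probs, residuals)):
                grads[j] += p * (-2.0 * r * x) / n
        # M-like step: a single gradient step instead of a full minimisation.
        thetas = [t - step * g for t, g in zip(thetas, grads)]
    return thetas


# Toy data (hypothetical): points alternate between the lines y = 2x and y = -2x.
xs = [0.5 * i for i in range(-10, 11) if i != 0]
ys = [2.0 * x if i % 2 == 0 else -2.0 * x for i, x in enumerate(xs)]
fitted = gradient_em(xs, ys, thetas=[1.0, -1.0])
```

With this initialization each slope starts closer to one of the two lines, which mirrors the abstract's requirement of a proper initialization; a poorly separated start is not expected to behave as well.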

artificial intelligence, linear regression, machine learning, (13 more...)

arXiv.org Machine Learning

2604.05842

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India > Maharashtra > Mumbai (0.04)
Asia > China (0.04)

Genre: Research Report (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Global Analysis of Expectation Maximization for Mixtures of Two Gaussians

Ji Xu, Daniel J. Hsu, Arian Maleki

Neural Information Processing Systems, Mar-23-2026, 13:05:51 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, stationary point, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Global Analysis of Expectation Maximization for Mixtures of Two Gaussians

Neural Information Processing Systems, Mar-17-2026, 08:57:10 GMT

Expectation Maximization (EM) is among the most popular algorithms for estimating parameters of statistical models. However, EM, which is an iterative algorithm based on the maximum likelihood principle, is generally only guaranteed to find stationary points of the likelihood objective, and these points may be far from any maximizer. This article addresses this disconnect between the statistical principles behind EM and its algorithmic properties. Specifically, it provides a global analysis of EM for specific models in which the observations comprise an i.i.d.
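To make the EM iteration concrete, the sketch below assumes a balanced 1-D mixture 0.5 N(-θ, 1) + 0.5 N(θ, 1) with known unit variance. This model choice is an assumption made for illustration, not a detail taken from the truncated abstract above. Under that assumption the posterior weight of the +θ component is 1 / (1 + exp(-2θx)), and the E-step and M-step collapse into the single update θ ← mean(x · tanh(θx)). The function name and the toy data are hypothetical.

```python
import math


def em_symmetric_mixture(xs, theta0, iters=100):
    """EM for the assumed model 0.5*N(-theta, 1) + 0.5*N(theta, 1) in 1-D."""
    theta = theta0
    for _ in range(iters):
        # E-step: w = P(component +theta | x) = 1 / (1 + exp(-2*theta*x)),
        #         so 2*w - 1 = tanh(theta * x).
        # M-step: theta = mean((2*w - 1) * x).
        theta = sum(x * math.tanh(theta * x) for x in xs) / len(xs)
    return theta


# Toy sample (hypothetical), symmetric around zero.
xs = [-2.5, -2.0, -1.5, 1.5, 2.0, 2.5]

theta_hat = em_symmetric_mixture(xs, theta0=0.1)  # small positive start
stuck = em_symmetric_mixture(xs, theta0=0.0)      # start exactly at zero
```

Starting from exactly zero, the update never moves, because tanh(0) = 0: zero is a stationary point of the iteration. This is a small instance of the abstract's remark that EM is only guaranteed to reach stationary points.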

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.82)

1dba5eed8838571e1c80af145184e515-Supplemental.pdf

Neural Information Processing Systems, Feb-7-2026, 18:15:48 GMT

discrimination, historical encoder, maximization step, (10 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Bayes beats Cross Validation: Efficient and Accurate Ridge Regression via Expectation Maximization

Neural Information Processing Systems, Dec-24-2025, 20:13:31 GMT

We present a novel method for tuning the regularization hyper-parameter, $\lambda$, of a ridge regression that is faster to compute than leave-one-out cross-validation (LOOCV) while yielding estimates of the regression parameters of equal or, particularly in the setting of sparse covariates, superior quality to those obtained by minimising the LOOCV risk. The LOOCV risk can suffer from multiple bad local minima for finite $n$ and thus requires the specification of a set of candidate $\lambda$, which can fail to provide good solutions. In contrast, we show that the proposed method is guaranteed to find a unique optimal solution for large enough $n$, under relatively mild conditions, without requiring the specification of any difficult-to-determine hyper-parameters. This is based on a Bayesian formulation of ridge regression that we prove to have a unimodal posterior for large enough $n$, allowing both the optimal $\lambda$ and the regression coefficients to be jointly learned within an iterative expectation maximization (EM) procedure. Importantly, we show that by utilizing an appropriate preprocessing step, a single iteration of the main EM loop can be implemented in $O(\min(n, p))$ operations, for input data with $n$ rows and $p$ columns.

cross validation, efficient and accurate ridge regression, name change, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.86)

Global Analysis of Expectation Maximization for Mixtures of Two Gaussians

Neural Information Processing Systems, Nov-21-2025, 14:52:37 GMT

Expectation Maximization (EM) is among the most popular algorithms for estimating parameters of statistical models. However, EM, which is an iterative algorithm based on the maximum likelihood principle, is generally only guaranteed to find stationary points of the likelihood objective, and these points may be far from any maximizer. This article addresses this disconnect between the statistical principles behind EM and its algorithmic properties. Specifically, it provides a global analysis of EM for specific models in which the observations comprise an i.i.d.

expectation maximization, global analysis, name change, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.82)

Neural Expectation Maximization

Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber

Neural Information Processing Systems, Nov-21-2025, 13:02:10 GMT

Many real world tasks such as reasoning and physical interaction require identification and manipulation of conceptual entities.

artificial intelligence, machine learning, representation, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Federated Expectation Maximization with heterogeneity mitigation and variance reduction

Neural Information Processing Systems, Aug-18-2025, 23:16:18 GMT

The Expectation Maximization (EM) algorithm is the default algorithm for inference in latent variable models. As in any other field of machine learning, applications of latent variable models to very large datasets make the use of advanced parallel and distributed architectures mandatory. This paper introduces FedEM, which is the first extension of the EM algorithm to the federated learning context. FedEM is a new communication-efficient method, which handles partial participation of local devices and is robust to heterogeneous distributions of the datasets. To alleviate the communication bottleneck, FedEM compresses appropriately defined complete-data sufficient statistics. We also develop and analyze an extension of FedEM that further incorporates a variance reduction scheme. In all cases, we derive finite-time complexity bounds for smooth non-convex problems. Numerical results are presented to support our theoretical findings, as well as an application to federated missing values imputation for biodiversity monitoring.

algorithm, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Learning Diffusion Priors from Observations by Expectation Maximization

Neural Information Processing Systems, May-27-2025, 10:49:06 GMT

Diffusion models recently proved to be remarkable priors for Bayesian inverse problems. However, training these models typically requires access to large amounts of clean data, which could prove difficult in some settings. In this work, we present a novel method based on the expectation-maximization algorithm for training diffusion models from incomplete and noisy observations only. Unlike previous works, our method leads to proper diffusion models, which is crucial for downstream tasks. As part of our method, we propose and motivate an improved posterior sampling scheme for unconditional diffusion models.

diffusion model, expectation maximization, observation

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reviews: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians

Neural Information Processing Systems, Jan-20-2025, 13:26:55 GMT

Quality: The paper is technically sound with nontrivial results. The conclusions of the paper are well supported by the theory. The conditions on the initial parameter that lead to convergence to the true parameter are particularly interesting. Clarity: This is a very well written paper and it reads well. The math, though not trivial, is very accessible because of the presentation. The authors have provided adequate commentary that aids intuition and understanding.

algorithm, expectation maximization, global analysis, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)
