AITopics | true center

Learning Mixture of Gaussians with Streaming Data

Neural Information Processing Systems, Nov-21-2025, 13:47:13 GMT

Our main contribution is the first bias-variance bound for the problem of learning Gaussian mixtures with streaming data.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > India (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Learning Mixture of Gaussians with Streaming Data

Neural Information Processing Systems, Oct-4-2024, 10:17:01 GMT

In this paper, we study the problem of learning a mixture of Gaussians with streaming data: given a stream of N points in d dimensions generated by an unknown mixture of k spherical Gaussians, the goal is to estimate the model parameters using a single pass over the data stream. We analyze a streaming version of the popular Lloyd's heuristic and show that the algorithm estimates all the unknown centers of the component Gaussians accurately if they are sufficiently separated.

algorithm, gaussian, probability, (16 more...)

Neural Information Processing Systems

Country:

Asia > India (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Jigsaw Game: Federated Clustering

Xu, Jinxuan, Chen, Hong-You, Chao, Wei-Lun, Zhang, Yuqian

arXiv.org Artificial Intelligence, Jul-17-2024

Federated learning has recently garnered significant attention, especially within the domain of supervised learning. However, despite the abundance of unlabeled data on end-users, unsupervised learning problems such as clustering in the federated setting remain underexplored. In this paper, we investigate the federated clustering problem, with a focus on federated k-means. We outline the challenge posed by its non-convex objective and data heterogeneity in the federated framework. To tackle these challenges, we adopt a new perspective by studying the structures of local solutions in k-means and propose a one-shot algorithm called FeCA (Federated Centroid Aggregation). FeCA adaptively refines local solutions on clients, then aggregates these refined solutions to recover the global solution of the entire dataset in a single round. We empirically demonstrate the robustness of FeCA under various federated scenarios on both synthetic and real-world data. Additionally, we extend FeCA to representation learning and present DeepFeCA, which combines Deep-Cluster and FeCA for unsupervised feature learning in the federated setting.
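As a rough illustration of the one-shot pattern described above (clients solve k-means locally, the server aggregates once), here is a minimal sketch of a naive count-weighted centroid aggregation. It is not FeCA: the adaptive refinement of local solutions that the abstract describes is omitted entirely. The function name `lloyd_1d`, the shared initialization, and the toy one-dimensional client data are all assumptions made for illustration.

```python
def lloyd_1d(points, centers, weights=None, max_iter=100):
    """Weighted Lloyd's algorithm on 1-D points; returns (centers, cluster masses)."""
    weights = weights or [1.0] * len(points)
    centers = list(centers)
    for _ in range(max_iter):
        sums = [0.0] * len(centers)
        mass = [0.0] * len(centers)
        # Assignment step: each point contributes its weight to the nearest center.
        for x, w in zip(points, weights):
            j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            sums[j] += w * x
            mass[j] += w
        # Update step: weighted mean per cluster (empty clusters keep their center).
        new = [s / m if m > 0 else c for s, m, c in zip(sums, mass, centers)]
        if new == centers:
            break
        centers = new
    return centers, mass


# Hypothetical toy data: two clients, each holding points from three clusters.
client_a = [-1.0, 0.0, 1.0, 9.0, 11.0, 20.0]
client_b = [0.5, 1.5, 10.0, 19.0, 21.0, 23.0]
init = [0.0, 10.0, 20.0]  # assumed shared initialization

# Each client runs k-means locally and reports centroids plus cluster sizes.
reports = [lloyd_1d(data, init) for data in (client_a, client_b)]
local_centroids = [c for centers, _ in reports for c in centers]
local_sizes = [m for _, sizes in reports for m in sizes]

# Single communication round: the server clusters the reported centroids,
# weighting each one by the number of points it summarizes.
global_centers, _ = lloyd_1d(local_centroids, init, weights=local_sizes)

# Reference: k-means on the pooled data, which the server never sees.
central_centers, _ = lloyd_1d(client_a + client_b, init)
```

In this benign case the aggregated centers coincide with the centralized ones, because every client's local solution places one centroid on each true cluster. The harder case the abstract targets is a client whose local k-means lands on a spurious solution, which this naive aggregation does not repair.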

centroid, local solution, true center, (15 more...)

arXiv.org Artificial Intelligence

2407.12764

Country:

North America > United States > Ohio (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
North America > United States > Virginia (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)

Likelihood Landscape and Local Minima Structures of Gaussian Mixture Models

Chen, Yudong, Xi, Xumei

arXiv.org Machine Learning, Sep-27-2020

Mixture models, as exemplified by the Gaussian mixture model (GMM), are widely used for approximating complex multi-modal distributions. They can also be viewed as a form of latent variable models that provide a flexible approach for statistical inference with heterogeneous data. To estimate the parameters of GMM, a standard approach is via the maximum likelihood principle. When the global optimum of the likelihood function can be computed, the statistical properties of the maximum likelihood estimator are relatively well studied, including its asymptotic consistency [25] and finite-sample error rates [5, 21, 14]. Much less understood are the computational challenges associated with estimating GMMs. The negative log-likelihood function of GMM is nonconvex and in general has multiple local minima. Standard iterative algorithms, such as Expectation-Maximization (EM) [9], are only guaranteed to converge to a local minimum [27, 16].
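For reference, the objective discussed above can be written out as follows. The notation is assumed here rather than taken from the paper: $x_1,\dots,x_n$ are the data points, $k$ is the number of components, and $\pi_j$, $\mu_j$, $\Sigma_j$ are the mixing weight, mean, and covariance of component $j$.

```latex
L(\theta) \;=\; -\frac{1}{n}\sum_{i=1}^{n}
  \log\!\Bigg( \sum_{j=1}^{k} \pi_j \,
  \mathcal{N}\big(x_i \mid \mu_j, \Sigma_j\big) \Bigg),
\qquad
\theta = \{\pi_j, \mu_j, \Sigma_j\}_{j=1}^{k},
\quad \pi_j \ge 0,\ \ \sum_{j=1}^{k}\pi_j = 1 .
```

The logarithm of a sum of Gaussian densities is what makes $L$ nonconvex in the means: relabeling the components leaves $L$ unchanged, so any global minimizer with distinct components already has $k!$ equivalent copies.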

artificial intelligence, equation, machine learning, (18 more...)

arXiv.org Machine Learning

2009.1304

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.74)

Structures of Spurious Local Minima in $k$-means

Qian, Wei, Zhang, Yuqian, Chen, Yudong

arXiv.org Machine Learning, Feb-21-2020

$k$-means clustering is a fundamental problem in unsupervised learning. The problem concerns finding a partition of the data points into $k$ clusters such that the within-cluster variation is minimized. Despite its importance and wide applicability, a theoretical understanding of the $k$-means problem has not been completely satisfactory. Existing algorithms with theoretical performance guarantees often rely on sophisticated (sometimes artificial) algorithmic techniques and restricted assumptions on the data. The main challenge lies in the non-convex nature of the problem; in particular, there exist additional local solutions other than the global optimum. Moreover, the simplest and most popular algorithm for $k$-means, namely Lloyd's algorithm, generally converges to such spurious local solutions both in theory and in practice. In this paper, we approach the $k$-means problem from a new perspective, by investigating the structures of these spurious local solutions under a probabilistic generative model with $k$ ground truth clusters. As soon as $k=3$, spurious local minima provably exist, even for well-separated and balanced clusters. One such local minimum puts two centers at one true cluster, and the third center in the middle of the other two true clusters. For general $k$, one local minimum puts multiple centers at a true cluster, and one center in the middle of multiple true clusters. Perhaps surprisingly, we prove that this is essentially the only type of spurious local minima under a separation condition. Our results pertain to the $k$-means formulation for mixtures of Gaussians or bounded distributions. Our theoretical results corroborate existing empirical observations and provide justification for several improved algorithms for $k$-means clustering.
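The $k=3$ configuration described above (two centers on one true cluster, the third between the other two) can be reproduced on a small example. The sketch below is a plain one-dimensional Lloyd's algorithm run on hand-picked toy data; the function names, the data, and the two initializations are assumptions chosen for illustration, not taken from the paper.

```python
def lloyd(points, centers, max_iter=100):
    """Plain Lloyd's algorithm in one dimension; returns the final centers."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[j].append(x)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in points)


# Three well-separated toy clusters around 0, 10, and 20.
points = [-1.0, -0.5, 0.5, 1.0, 9.0, 10.0, 11.0, 19.0, 20.0, 21.0]

# Two centers start inside the first cluster, one between the other two.
bad = lloyd(points, [-0.75, 0.75, 15.0])
# One center starts on each cluster.
good = lloyd(points, [0.0, 10.0, 20.0])

bad_cost = kmeans_cost(points, bad)
good_cost = kmeans_cost(points, good)
```

Both runs stop at a fixed point of Lloyd's algorithm, but the first one keeps the split-and-merge configuration and ends with a much higher cost than the second.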

equation, local minimum, voronoi, (15 more...)

arXiv.org Machine Learning

2002.06694

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Learning Mixture of Gaussians with Streaming Data

Raghunathan, Aditi, Jain, Prateek, Krishnaswamy, Ravishankar

Neural Information Processing Systems, Dec-31-2017

In this paper, we study the problem of learning a mixture of Gaussians with streaming data: given a stream of $N$ points in $d$ dimensions generated by an unknown mixture of $k$ spherical Gaussians, the goal is to estimate the model parameters using a single pass over the data stream. We analyze a streaming version of the popular Lloyd's heuristic and show that the algorithm estimates all the unknown centers of the component Gaussians accurately if they are sufficiently separated. Assuming each pair of centers is $C\sigma$ apart with $C=\Omega((k\log k)^{1/4})$, where $\sigma^2$ is the maximum variance of any Gaussian component, we show that asymptotically the algorithm estimates the centers optimally (up to certain constants); our center separation requirement matches the best known result for spherical Gaussians (Vempala and Wang). For finite samples, we show that a bias term based on the initial estimate decreases at an $O(1/\mathrm{poly}(N))$ rate while the variance decreases at the nearly optimal rate of $\sigma^2 d/N$. Our analysis requires seeding the algorithm with a good initial estimate of the true cluster centers, for which we provide an online PCA based clustering algorithm. The asymptotic per-step time complexity of our algorithm is the optimal $d\cdot k$, while its space complexity is $O(dk\log k)$. In addition to the bias and variance terms, which tend to $0$, the hard-thresholding based updates of the streaming Lloyd's algorithm are agnostic to the data distribution and hence incur an approximation error that cannot be avoided. However, by using a streaming version of the classical (soft-thresholding-based) EM method that exploits the Gaussian distribution explicitly, we show that for a mixture of two Gaussians the true means can be estimated consistently, with estimation error decreasing at a nearly optimal rate and tending to $0$ as $N\rightarrow \infty$.
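A single-pass, hard-assignment update of the kind described above might look like the following minimal sketch. It assumes a good set of initial centers is already available (the abstract obtains one from an online PCA based step, which is not shown) and uses a constant step size `eta`. The step size, the function names, and the toy mixture are illustrative assumptions, not the paper's exact procedure or schedule.

```python
import random


def streaming_lloyd(stream, init_centers, eta):
    """Single-pass Lloyd-style update: move the nearest center toward each point."""
    centers = [list(c) for c in init_centers]
    for x in stream:
        # Hard assignment: pick the center closest to the incoming point.
        j = min(
            range(len(centers)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])),
        )
        # Convex-combination update of the selected center only.
        centers[j] = [(1 - eta) * c + eta * a for c, a in zip(centers[j], x)]
    return centers


def sample_mixture(true_centers, sigma, n, rng):
    """Yield n points from an equal-weight mixture of spherical Gaussians."""
    for _ in range(n):
        mu = rng.choice(true_centers)
        yield [rng.gauss(m, sigma) for m in mu]


rng = random.Random(0)
true_centers = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
init = [[1.0, 1.0], [9.0, -1.0], [-1.0, 9.0]]  # assumed rough initial estimate

estimate = streaming_lloyd(
    sample_mixture(true_centers, 1.0, 5000, rng), init, eta=0.01
)
```

Each step touches every center once to find the nearest one and then updates a single center, so the per-point work is proportional to $d\cdot k$ and only the $k$ current centers are stored. Because the stream is consumed as a generator, no data point is kept after its update.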

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Learning Mixture of Gaussians with Streaming Data

Raghunathan, Aditi, Krishnaswamy, Ravishankar, Jain, Prateek

arXiv.org Machine Learning, Jul-7-2017

In this paper, we study the problem of learning a mixture of Gaussians with streaming data: given a stream of $N$ points in $d$ dimensions generated by an unknown mixture of $k$ spherical Gaussians, the goal is to estimate the model parameters using a single pass over the data stream. We analyze a streaming version of the popular Lloyd's heuristic and show that the algorithm estimates all the unknown centers of the component Gaussians accurately if they are sufficiently separated. Assuming each pair of centers is $C\sigma$ apart with $C=\Omega((k\log k)^{1/4})$, where $\sigma^2$ is the maximum variance of any Gaussian component, we show that asymptotically the algorithm estimates the centers optimally (up to constants); our center separation requirement matches the best known result for spherical Gaussians (Vempala and Wang). For finite samples, we show that a bias term based on the initial estimate decreases at an $O(1/\mathrm{poly}(N))$ rate while the variance decreases at the nearly optimal rate of $\sigma^2 d/N$. Our analysis requires seeding the algorithm with a good initial estimate of the true cluster centers, for which we provide an online PCA based clustering algorithm. The asymptotic per-step time complexity of our algorithm is the optimal $d\cdot k$, while its space complexity is $O(dk\log k)$. In addition to the bias and variance terms, which tend to $0$, the hard-thresholding based updates of the streaming Lloyd's algorithm are agnostic to the data distribution and hence incur an approximation error that cannot be avoided. However, by using a streaming version of the classical (soft-thresholding-based) EM method that exploits the Gaussian distribution explicitly, we show that for a mixture of two Gaussians the true means can be estimated consistently, with estimation error decreasing at a nearly optimal rate and tending to $0$ as $N\rightarrow \infty$.
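The soft-assignment variant for two components can be sketched in a few lines. The version below makes simplifying assumptions that are not stated in the abstract: the two Gaussians have equal weights, a known common variance $\sigma^2$, and means $+\mu$ and $-\mu$. Under those assumptions the posterior weight $w$ of the $+\mu$ component satisfies $2w-1=\tanh(\langle x,\mu\rangle/\sigma^2)$. The decaying step size $1/(t+1)$ and all names are also illustrative choices rather than the paper's schedule.

```python
import math
import random


def streaming_em_two_gaussians(stream, init_mu, sigma):
    """Single-pass EM-style estimate of mu for 0.5*N(mu, s^2 I) + 0.5*N(-mu, s^2 I)."""
    mu = list(init_mu)
    for t, x in enumerate(stream, start=1):
        eta = 1.0 / (t + 1)  # decaying step size (an assumed schedule)
        # Soft assignment: 2*w - 1 = tanh(<x, mu> / sigma^2), where w is the
        # posterior probability that x came from the +mu component.
        inner = sum(a * b for a, b in zip(x, mu))
        weight = math.tanh(inner / sigma ** 2)
        # Move mu toward the sign-corrected point.
        mu = [(1 - eta) * m + eta * weight * a for m, a in zip(mu, x)]
    return mu


def sample_symmetric_mixture(mu_star, sigma, n, rng):
    """Yield n points from the equal-weight mixture with means +mu_star and -mu_star."""
    for _ in range(n):
        sign = rng.choice([1.0, -1.0])
        yield [sign * m + rng.gauss(0.0, sigma) for m in mu_star]


rng = random.Random(1)
mu_star = [2.0, 0.0]
mu_hat = streaming_em_two_gaussians(
    sample_symmetric_mixture(mu_star, 1.0, 5000, rng),
    init_mu=[1.0, 0.5],  # assumed initial guess, positively correlated with mu_star
    sigma=1.0,
)
```

Unlike the hard assignment in Lloyd's update, every point contributes to the estimate with a weight between $-1$ and $1$, so points near the decision boundary are down-weighted rather than forced onto one side.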

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1707.02391

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences

Jin, Chi, Zhang, Yuchen, Balakrishnan, Sivaraman, Wainwright, Martin J., Jordan, Michael

arXiv.org Machine Learning, Sep-4-2016

We provide two fundamental results on the population (infinite-sample) likelihood function of Gaussian mixture models with $M \geq 3$ components. Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians. We prove that the log-likelihood value of these bad local maxima can be arbitrarily worse than that of any global optimum, thereby resolving an open question of Srebro (2007). Our second main result shows that the EM algorithm (or a first-order variant of it) with random initialization will converge to bad critical points with probability at least $1-e^{-\Omega(M)}$. We further establish that a first-order variant of EM will not converge to strict saddle points almost surely, indicating that the poor performance of the first-order method can be attributed to the existence of bad local maxima rather than bad saddle points. Overall, our results highlight the necessity of careful initialization when using the EM algorithm in practice, even when applied in highly favorable settings.
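The sensitivity to initialization can be seen on a toy finite sample, which is only a loose analogue of the population-level results above. The sketch below runs EM on a one-dimensional, equal-weight, unit-variance mixture with $M=3$ components, updating the means only. The data, the two initializations, and the function names are assumptions chosen for illustration.

```python
import math


def em_means(points, means, n_iter=200):
    """EM for a 1-D mixture with equal weights and unit variances (means only)."""
    means = list(means)
    for _ in range(n_iter):
        sums = [0.0] * len(means)
        mass = [0.0] * len(means)
        for x in points:
            # E-step: posterior responsibility of each component for x.
            dens = [math.exp(-0.5 * (x - m) ** 2) for m in means]
            total = sum(dens)
            for j, d in enumerate(dens):
                sums[j] += d / total * x
                mass[j] += d / total
        # M-step: each mean becomes the responsibility-weighted average.
        means = [s / w for s, w in zip(sums, mass)]
    return means


def log_likelihood(points, means):
    """Log-likelihood of the equal-weight, unit-variance mixture."""
    k = len(means)
    return sum(
        math.log(sum(math.exp(-0.5 * (x - m) ** 2) for m in means) / k)
        - 0.5 * math.log(2 * math.pi)
        for x in points
    )


# Three well-separated toy clusters around 0, 10, and 20.
points = [-1.0, -0.5, 0.5, 1.0, 9.0, 10.0, 11.0, 19.0, 20.0, 21.0]

# Two components start on the first cluster, one between the other two.
bad = em_means(points, [-0.5, 0.5, 15.0])
# One component starts on each cluster.
good = em_means(points, [0.0, 10.0, 20.0])

ll_bad = log_likelihood(points, bad)
ll_good = log_likelihood(points, good)
```

From the first initialization EM keeps two means on the cluster near 0 and leaves the third stranded between the clusters near 10 and 20, ending with a clearly lower log-likelihood than the run that starts with one mean per cluster.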

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1609.00978

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.74)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
