AITopics | Bowei Yan

Clustering is an important unsupervised learning problem in machine learning and statistics. Among the many existing algorithms, kernel k-means has drawn much research attention due to its ability to find non-linear cluster boundaries and its inherent simplicity. There are two main approaches to kernel k-means: SVD of the kernel matrix (K-SVD) and convex relaxations. Despite the attention kernel clustering has received from both theoretical and applied quarters, not much is known about the robustness of these methods. In this paper we first introduce a semidefinite programming (SDP) relaxation for the kernel clustering problem, then prove that under a suitable model specification, both the K-SVD and SDP approaches are consistent in the limit, although SDP is strongly consistent, i.e. achieves exact recovery, whereas K-SVD is weakly consistent, i.e. the fraction of misclassified nodes vanishes. The error bounds also suggest that SDP is more resilient to outliers, which we also demonstrate with experiments.
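
As a rough illustration of the SVD-based approach mentioned in the abstract, the sketch below clusters the rows of the top-k eigenvectors of a kernel matrix with plain Lloyd-style k-means. The Gaussian kernel, the `bandwidth` parameter, the farthest-point initialization and all function names are assumptions made for this example; it is not the paper's exact procedure and does not cover the SDP relaxation.

```python
import numpy as np

def gaussian_kernel(X, bandwidth):
    # Pairwise squared distances, then K_ij = exp(-||x_i - x_j||^2 / (2 * bandwidth^2)).
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def farthest_point_init(Z, k):
    # Deterministic seeding (an assumption): start at row 0, then repeatedly
    # pick the row farthest from all centers chosen so far.
    centers = [Z[0]]
    for _ in range(k - 1):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    return np.array(centers)

def kmeans(Z, k, n_iter=50):
    # Plain Lloyd iterations; an empty cluster keeps its previous center.
    centers = farthest_point_init(Z, k)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def kernel_svd_clustering(X, k, bandwidth=1.0):
    # The kernel matrix is symmetric, so eigh applies; eigenvalues come back
    # in ascending order, hence the last k columns are the top-k eigenvectors.
    K = gaussian_kernel(X, bandwidth)
    _, vecs = np.linalg.eigh(K)
    U = vecs[:, -k:]
    return kmeans(U, k)
```

Calling `kernel_svd_clustering(X, k=2, bandwidth=2.0)` on an `(n, d)` array returns one integer label per row; label values are arbitrary up to permutation.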

artificial intelligence, machine learning, outlier, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)


Convergence of Gradient EM on Multi-component Mixture of Gaussians

Bowei Yan, Mingzhang Yin, Purnamrita Sarkar

Neural Information Processing Systems, May-28-2025, 05:03:44 GMT

In this paper, we study convergence properties of the gradient variant of the Expectation-Maximization algorithm [11] for Gaussian Mixture Models with an arbitrary number of clusters and mixing coefficients. We derive the convergence rate as a function of the mixing coefficients, the minimum and maximum pairwise distances between the true centers, the dimensionality and the number of components, and obtain a near-optimal local contraction radius. While some notable recent works derive local convergence rates for EM in the symmetric two-component mixture of Gaussians, the more general case requires structurally different and non-trivial arguments. We use recent tools from learning theory and empirical processes to achieve our theoretical results.
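
For readers unfamiliar with the gradient variant of EM, the following minimal sketch shows one way the update can look. It assumes identity covariances and known mixing weights, so only the component means are updated; these simplifications, the step size and all function names are assumptions for this example rather than details taken from the abstract. Instead of the closed-form M-step, each iteration takes a single gradient ascent step on the expected complete-data log-likelihood: the mean of component i moves by `step * (1/n) * sum_j w_i(x_j) * (x_j - mu_i)`, where `w_i(x_j)` is the posterior responsibility.

```python
import numpy as np

def responsibilities(X, means, weights):
    # E-step: posterior probability of each component for each point,
    # assuming identity covariance (assumption of this sketch).
    sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_w = np.log(weights)[None, :] - 0.5 * sq
    log_w -= log_w.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)

def gradient_em_step(X, means, weights, step=1.0):
    # Gradient of the expected log-likelihood with respect to mean i is
    # (1/n) * sum_j R[j, i] * (x_j - mu_i); take one ascent step.
    R = responsibilities(X, means, weights)
    n = X.shape[0]
    grad = (R.T @ X - R.sum(axis=0)[:, None] * means) / n
    return means + step * grad

def gradient_em(X, init_means, weights, step=1.0, n_iter=200):
    # Repeat the gradient step from a given initialization.
    means = np.array(init_means, dtype=float)
    for _ in range(n_iter):
        means = gradient_em_step(X, means, weights, step)
    return means
```

Here `X` is an `(n, d)` data array, `init_means` a `(k, d)` array of starting centers and `weights` a length-`k` vector of mixing coefficients summing to one. The sketch is only meant for initializations reasonably close to the true centers, in line with the local analysis the abstract describes.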

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)


Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues

Soumendu Sundar Mukherjee, Purnamrita Sarkar, Y. X. Rachel Wang, Bowei Yan

Neural Information Processing Systems, May-26-2025, 10:24:12 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, initialization, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States > Texas > Travis County > Austin (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues

Soumendu Sundar Mukherjee, Purnamrita Sarkar, Y. X. Rachel Wang, Bowei Yan

Neural Information Processing Systems, Mar-27-2025, 02:06:40 GMT

Neural Information Processing Systems http://nips.cc/


Convergence of Gradient EM on Multi-component Mixture of Gaussians

Bowei Yan, Mingzhang Yin, Purnamrita Sarkar

Neural Information Processing Systems, Oct-4-2024, 08:30:48 GMT

In this paper, we study convergence properties of the gradient variant of the Expectation-Maximization algorithm [11] for Gaussian Mixture Models with an arbitrary number of clusters and mixing coefficients. We derive the convergence rate as a function of the mixing coefficients, the minimum and maximum pairwise distances between the true centers, the dimensionality and the number of components, and obtain a near-optimal local contraction radius. While some notable recent works derive local convergence rates for EM in the symmetric two-component mixture of Gaussians, the more general case requires structurally different and non-trivial arguments. We use recent tools from learning theory and empirical processes to achieve our theoretical results.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)
