Mixed Membership sub-Gaussian Models

arXiv.org Machine Learning

The Gaussian mixture model is widely used in unsupervised learning, owing to its simplicity and interpretability. However, a fundamental limitation of the classical Gaussian mixture model is that it forces each observation to belong to exactly one component. In many practical applications, such as genetics, social network analysis, and text mining, an observation may naturally belong to multiple components or exhibit partial membership in several latent components. To overcome this limitation, we propose the mixed membership sub-Gaussian model, which extends the classical Gaussian mixture framework by allowing each observation to belong to multiple components. This model inherits the interpretability of the classical Gaussian mixture model while offering greater flexibility for capturing complex overlapping structures. We develop an efficient spectral algorithm to estimate the mixed membership of each individual observation, and under mild separation conditions on the component centres, we prove that the estimation error of the per-individual membership vector can be made arbitrarily small with high probability. To our knowledge, this is the first work to provide a computationally efficient estimator with such a vanishing-error guarantee for a mixed-membership extension of the Gaussian mixture model. Extensive experimental studies demonstrate that our method outperforms existing approaches that ignore mixed memberships.
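
To make the idea of partial membership concrete, the following is a minimal sketch of how data with mixed memberships could be simulated. It is not the authors' model specification or their spectral algorithm: the abstract does not give the generative details, so the function name `sample_mixed_membership`, the Dirichlet prior on memberships, the convex-combination mean, and the Gaussian noise (one particular sub-Gaussian choice) are all assumptions made for illustration.

```python
import numpy as np


def sample_mixed_membership(n, centres, alpha=0.5, noise_sd=0.1, seed=0):
    """Simulate n observations with mixed memberships (illustrative only).

    Assumptions (not taken from the paper):
      - each membership vector is drawn from a symmetric Dirichlet(alpha),
        so it is non-negative and sums to one;
      - the mean of an observation is the membership-weighted average
        of the component centres;
      - the noise is isotropic Gaussian with standard deviation noise_sd.
    """
    rng = np.random.default_rng(seed)
    centres = np.asarray(centres, dtype=float)
    k, d = centres.shape

    # One membership vector per observation, shape (n, k).
    memberships = rng.dirichlet(alpha * np.ones(k), size=n)

    # Convex combination of the k centres, shape (n, d).
    means = memberships @ centres

    # Additive Gaussian noise, shape (n, d).
    noise = rng.normal(scale=noise_sd, size=(n, d))

    return means + noise, memberships


if __name__ == "__main__":
    # Three hypothetical component centres in three dimensions.
    X, Pi = sample_mixed_membership(200, np.eye(3))
    print(X.shape, Pi.shape)
```

Under these assumptions, a classical Gaussian mixture is recovered as the special case in which every membership vector has a single entry equal to one, which is why a hard-assignment method would discard information on data of this kind.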

Neural Information Processing Systems

Firstly, we thank the reviewers for their valuable comments. Whilst it is not reasonable in practice to assume that data is sampled i.i.d. As previously stated, we believe our work forms a first step in achieving this goal. We believe that our theoretical model captures this dynamic. An insurance company may gather information from a customer to better evaluate potential risk.