Stagewise Learning for Sparse Clustering of Discretely-Valued Data

May-27-2016–arXiv.org Machine Learning

We study the model-based sparse clustering problem for discrete data using a mixture model of product distributions [9, 7]. This model has applications in many fields, including computational neuroscience, crowdsourcing and bioinformatics, and is interesting because it differs technically from the problem for continuous data, where the well-known Gaussian mixture model has been applied successfully. A fundamental difficulty is that, in high-dimensional datasets, some features can be noisy, redundant or generally uninformative for clustering, and these can push clustering algorithms toward inappropriate or uninteresting results. If these uninformative or noisy data points could be eliminated then, we argue, the results should be much more satisfying. This is precisely our goal: to find an informative set of data points and to use these to drive the clustering.

algorithm, artificial intelligence, machine learning

Genre:
- Research Report (0.40)

Industry:
- Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Statistical Learning
    - Clustering (0.86)