Streaming k-means approximation

Ailon, Nir, Jaiswal, Ragesh, Monteleoni, Claire

Dec-31-2009–Neural Information Processing Systems

We provide a clustering algorithm that approximately optimizes the k-means objective in the one-pass streaming setting. We make no assumptions about the data, and our algorithm is very lightweight in terms of memory and computation. This setting is applicable to unsupervised learning on massive data sets or on resource-constrained devices. The two main ingredients of our theoretical work are: a derivation of an extremely simple pseudo-approximation batch algorithm for k-means (based on the recent k-means++), in which the algorithm is allowed to output more than k centers, and a streaming clustering algorithm in which batch clustering algorithms are performed on small inputs (fitting in memory) and combined in a hierarchical manner. Empirical evaluations on real and simulated data reveal the practical utility of our method.
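The two ingredients named in the abstract can be illustrated with a small sketch. It is not the authors' algorithm or their parameter settings; it only shows the general shape of the idea under stated assumptions. The batch step uses D^2 sampling, the seeding rule of k-means++, and is allowed to return more than k centers. The streaming step summarizes each in-memory chunk by weighted centers and then reduces the summaries to k centers. All function names, the `oversample` factor, the single-level hierarchy, and the reuse of D^2 sampling for the final reduction are hypothetical choices made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def d2_sample(points, weights, num_centers, rng):
    """Pick up to num_centers input points by weighted D^2 sampling."""
    # First center: drawn in proportion to weight alone.
    idx = rng.choices(range(len(points)), weights=weights, k=1)[0]
    centers = [points[idx]]
    # nearest[i] is the squared distance from points[i] to its closest center.
    nearest = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < num_centers:
        probs = [w * d for w, d in zip(weights, nearest)]
        if sum(probs) == 0:  # every point already coincides with a center
            break
        idx = rng.choices(range(len(points)), weights=probs, k=1)[0]
        centers.append(points[idx])
        nearest = [min(d, sq_dist(p, points[idx])) for p, d in zip(points, nearest)]
    return centers

def summarize(points, weights, num_centers, rng):
    """Replace a weighted point set by centers carrying the weight they absorb."""
    centers = d2_sample(points, weights, num_centers, rng)
    totals = [0.0] * len(centers)
    for p, w in zip(points, weights):
        j = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
        totals[j] += w
    return centers, totals

def stream_kmeans(stream, k, chunk_size, oversample, rng):
    """One pass over a non-empty stream, holding one chunk plus the summaries."""
    summary_pts, summary_wts, chunk = [], [], []

    def flush():
        if chunk:
            # Assumption: oversample * k centers per chunk (more than k allowed).
            c, w = summarize(chunk, [1.0] * len(chunk), oversample * k, rng)
            summary_pts.extend(c)
            summary_wts.extend(w)
            chunk.clear()

    for p in stream:
        chunk.append(p)
        if len(chunk) == chunk_size:
            flush()
    flush()
    # Final reduction of the weighted summaries down to k centers.
    return summarize(summary_pts, summary_wts, k, rng)

# Usage on simulated data: three well-separated 2-D blobs, read as a stream.
rng = random.Random(0)
data = [(rng.gauss(cx, 0.1), rng.gauss(cy, 0.1))
        for cx, cy in [(0, 0), (5, 5), (10, 0)] for _ in range(200)]
rng.shuffle(data)
centers, weights = stream_kmeans(iter(data), k=3, chunk_size=100, oversample=3, rng=rng)
```

Only one chunk and the accumulated summaries are held in memory at any time. A deeper hierarchy would apply the same summarize step again whenever the summaries themselves outgrow memory.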

algorithm, approximation algorithm, k-means objective

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
