AITopics | efficient clustering

efficient clustering

Efficient Clustering for Stretched Mixtures: Landscape and Optimality

Neural Information Processing Systems | Dec-24-2025, 21:28:01 GMT

This paper considers a canonical clustering problem where one receives unlabeled samples drawn from a balanced mixture of two elliptical distributions and aims for a classifier to estimate the labels. Many popular methods, including PCA and k-means, require the individual components of the mixture to be somewhat spherical and perform poorly when they are stretched. To overcome this issue, we propose a non-convex program that seeks an affine transform turning the data into a one-dimensional point cloud concentrated around -1 and 1, after which clustering becomes easy. Our theoretical contributions are two-fold: (1) we show that the non-convex loss function exhibits desirable geometric properties when the sample size exceeds some constant multiple of the dimension, and (2) we leverage this to prove that an efficient first-order algorithm achieves near-optimal statistical precision without good initialization. We also propose a general methodology for clustering with flexible choices of feature transforms and loss objectives.
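The idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the quartic loss f(t) = (t^2 - 1)^2 / 4, the penalty on the mean of the projected points, the whitening step, the fixed starting point, the step size, and the synthetic data are all assumptions chosen for illustration. The sketch fits an affine map t = a + b.x by plain gradient descent so that the projected points gather near -1 and 1, then labels each sample by the sign of t.

```python
import numpy as np

def make_stretched_mixture(n=2000, seed=0):
    """Balanced two-component mixture: axis 0 carries the labels, axis 1 is stretched noise."""
    rng = np.random.default_rng(seed)
    y = rng.choice([-1.0, 1.0], size=n)
    x_signal = y + 0.1 * rng.standard_normal(n)
    x_stretch = 5.0 * rng.standard_normal(n)
    return np.column_stack([x_signal, x_stretch]), y

def whiten(X):
    """Center the data and rescale to identity sample covariance (an assumed preprocessing step)."""
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(Xc.T @ Xc / len(Xc))
    return (Xc @ evecs) / np.sqrt(evals)

def fit_affine_map(X, steps=1000, lr=0.1):
    """Gradient descent on mean(f(t)) + 0.5 * mean(t)^2 with f(t) = (t^2 - 1)^2 / 4."""
    n, d = X.shape
    a = 0.0
    b = np.full(d, 0.1 / np.sqrt(d))  # small, uninformed starting point
    for _ in range(steps):
        t = a + X @ b
        g = t**3 - t                  # f'(t)
        m = t.mean()
        a -= lr * (g.mean() + m)
        b -= lr * (X.T @ g / n + m * X.mean(axis=0))
    return a, b

X, y = make_stretched_mixture()
Z = whiten(X)
a, b = fit_affine_map(Z)
t = a + Z @ b                         # one-dimensional point cloud
pred = np.sign(t)
# Cluster labels are only defined up to a global sign flip.
accuracy = max(np.mean(pred == y), np.mean(pred == -y))
print(f"accuracy up to label swap: {accuracy:.3f}")
```

The penalty on the mean discourages the trivial solution that sends every point to the same side; whether whitening is needed depends on the optimizer and is used here only to keep plain gradient descent well behaved.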

efficient clustering, name change, stretched mixture, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.61)

Efficient Clustering Based On A Unified View Of K-means And Ratio-cut

Neural Information Processing Systems | Dec-24-2025, 10:31:14 GMT

Spectral clustering and $k$-means, two major traditional clustering methods, still attract a lot of attention, although a variety of novel clustering algorithms have been proposed in recent years. First, a unified framework of $k$-means and ratio-cut is revisited, and a novel and efficient clustering algorithm is then proposed based on this framework. The time and space complexity of our method are both linear with respect to the number of samples and, more importantly, are independent of the number of clusters to construct. These properties mean that it is easily scalable and applicable to large practical problems. Extensive experiments on 12 real-world benchmark datasets and 8 facial datasets validate the advantages of the proposed algorithm compared to state-of-the-art clustering algorithms. In particular, over 15x and 7x speed-ups can be obtained with respect to $k$-means on the synthetic dataset of 1 million samples and the benchmark dataset (CelebA) of 200k samples, respectively [GitHub].
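One standard way to see $k$-means and ratio-cut side by side is through trace formulations over a normalized cluster-indicator matrix H, where H[i, c] = 1/sqrt(|A_c|) if sample i belongs to cluster A_c and 0 otherwise. The sketch below numerically checks two textbook identities; it is not the paper's framework or algorithm, and the random data, the Gaussian affinity matrix, and all function names are assumptions made for illustration.

```python
import numpy as np

def kmeans_objective(X, labels, k):
    """Sum of squared distances from each sample to its cluster mean."""
    total = 0.0
    for c in range(k):
        members = X[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return total

def normalized_indicator(labels, k):
    """H[i, c] = 1 / sqrt(|A_c|) if sample i is in cluster c, else 0."""
    H = np.zeros((len(labels), k))
    H[np.arange(len(labels)), labels] = 1.0
    return H / np.sqrt(H.sum(axis=0))

def ratio_cut(W, labels, k):
    """Sum over clusters of (affinity leaving the cluster) / (cluster size)."""
    total = 0.0
    for c in range(k):
        inside = labels == c
        total += W[np.ix_(inside, ~inside)].sum() / inside.sum()
    return total

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 3))
k = 4
labels = np.arange(60) % k            # arbitrary assignment, every cluster non-empty
H = normalized_indicator(labels, k)

# k-means objective = tr(K) - tr(H^T K H) with the linear kernel K = X X^T.
K = X @ X.T
kmeans_direct = kmeans_objective(X, labels, k)
kmeans_trace = np.trace(K) - np.trace(H.T @ K @ H)

# Ratio-cut = tr(H^T L H) with the graph Laplacian L = D - W.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-sq_dists)                 # assumed Gaussian affinity
L = np.diag(W.sum(axis=1)) - W
ratio_cut_direct = ratio_cut(W, labels, k)
ratio_cut_trace = np.trace(H.T @ L @ H)

print(kmeans_direct, kmeans_trace)
print(ratio_cut_direct, ratio_cut_trace)
```

Both objectives are functions of the same matrix H, which is what makes a shared treatment possible. Note that this check builds dense n-by-n matrices, so it does not reflect the linear time and space complexity claimed in the abstract.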

efficient clustering, name change, unified view, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Efficient Clustering for Stretched Mixtures: Landscape and Optimality

Neural Information Processing Systems | Aug-22-2025, 01:08:05 GMT

Suppose that we observe i.i.d.

algorithm, artificial intelligence, machine learning, (16 more...)

Country: North America > Canada (0.04)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Review for NeurIPS paper: Efficient Clustering Based On A Unified View Of K-means And Ratio-cut

Neural Information Processing Systems | Jan-27-2025, 11:11:31 GMT

Additional Feedback: EDIT: I am satisfied by the response of the authors that they will address the issues of clarity, after which I believe the paper represents a valuable contribution. I commend the authors for what appears to be an innovative algorithm with extremely good practical performance. I believe the paper could be a very influential one, but I feel the presentation of the work needs to be modified and improved. I think a few too many concessions are made. For example, you begin with ratio cut, then change to normalised cut when you assert that the affinity matrix is made doubly stochastic.

efficient clustering, k-means and ratio-cut, matrix, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.51)

Review for NeurIPS paper: Efficient Clustering Based On A Unified View Of K-means And Ratio-cut

Neural Information Processing Systems | Jan-27-2025, 11:11:24 GMT

The authors present an efficient algorithm for the sum-of-squares objective. The proposed method has very impressive experimental performance and could be of interest to a broad audience. The paper contains a number of typos that should be fixed before publication.

efficient clustering, k-means and ratio-cut, neurips paper, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.40)

Efficient Clustering for Stretched Mixtures: Landscape and Optimality

Neural Information Processing Systems | Jan-15-2025, 16:35:13 GMT

efficient clustering, landscape and optimality, stretched mixture

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.65)

Efficient Clustering Based On A Unified View Of K-means And Ratio-cut

Neural Information Processing Systems | Oct-11-2024, 01:43:48 GMT

Spectral clustering and k-means, two major traditional clustering methods, still attract a lot of attention, although a variety of novel clustering algorithms have been proposed in recent years. First, a unified framework of k-means and ratio-cut is revisited, and a novel and efficient clustering algorithm is then proposed based on this framework. The time and space complexity of our method are both linear with respect to the number of samples and, more importantly, are independent of the number of clusters to construct. These properties mean that it is easily scalable and applicable to large practical problems. Extensive experiments on 12 real-world benchmark datasets and 8 facial datasets validate the advantages of the proposed algorithm compared to state-of-the-art clustering algorithms.

algorithm, efficient clustering, unified view, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Efficient Clustering from Distributions over Topics

Badenes-Olmedo, Carlos, García, Jose-Luis Redondo, Corcho, Oscar

arXiv.org Artificial Intelligence | Dec-15-2020

There are many scenarios where we may want to find pairs of textually similar documents in a large corpus (e.g. a researcher doing a literature review, or an R&D project manager analyzing project proposals). Programmatically discovering those connections can help experts achieve those goals, but brute-force pairwise comparisons are not computationally feasible when the document corpus is too large. Some algorithms in the literature divide the search space into regions containing potentially similar documents, which are later processed separately from the rest in order to reduce the number of pairs compared. However, unsupervised methods of this kind still incur high time costs. In this paper, we present an approach that relies on the results of a topic modeling algorithm over the documents in a collection, as a means to identify smaller subsets of documents where the similarity function can then be computed. This approach has obtained promising results when identifying similar documents in the domain of scientific publications. We have compared our approach against state-of-the-art clustering techniques and with different configurations for the topic modeling algorithm. Results suggest that our approach outperforms (> 0.5) the other analyzed techniques in terms of efficiency.
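The general pattern described in the abstract above, using topic distributions to restrict which document pairs are compared, can be sketched as follows. This is a hypothetical illustration rather than the authors' method: bucketing documents by their single most probable topic, scoring pairs with the Jensen-Shannon divergence, and the toy topic matrix theta are all assumptions.

```python
import itertools
from collections import defaultdict

import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def candidate_pairs(theta):
    """Yield only pairs of documents that share the same dominant topic."""
    buckets = defaultdict(list)
    for doc_id, dist in enumerate(theta):
        buckets[int(np.argmax(dist))].append(doc_id)
    for members in buckets.values():
        yield from itertools.combinations(members, 2)

# Toy per-document topic distributions (rows sum to 1); values are made up.
theta = np.array([
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80],
])

pairs = list(candidate_pairs(theta))
scores = {pair: js_divergence(theta[pair[0]], theta[pair[1]]) for pair in pairs}
n_docs = len(theta)
brute_force_pairs = n_docs * (n_docs - 1) // 2

print(f"compared {len(pairs)} of {brute_force_pairs} possible pairs")
for pair, score in scores.items():
    print(pair, round(score, 4))
```

The saving comes from evaluating the similarity function only inside each bucket. The trade-off is that similar documents whose dominant topics differ are never compared, so the bucketing rule controls the balance between cost and recall.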

algorithm, dirichlet distribution, topic distribution, (14 more...)

doi: 10.1145/3148011.3148019

arXiv: 2012.08206

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > California (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)
