AITopics | Clustering


Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics. (Wikipedia)
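As a minimal illustration of the definition above, the sketch below runs k-means (Lloyd's algorithm), one common clustering method chosen here purely as an example. The function name `kmeans`, the toy points, and the hand-picked initial centers are assumptions for illustration and do not come from the source.

```python
from math import dist


def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm: alternate nearest-center assignment and centroid update."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
```

Points in the same group end up with the same label because they are closer to their own centroid than to the other one, which is the "more similar to each other" criterion under Euclidean distance.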


Learning Representations for Time Series Clustering

Qianli Ma, Jiawei Zheng, Sen Li, Gary W. Cottrell

Neural Information Processing Systems, Oct-2-2025, 04:02:54 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, data mining, machine learning, (15 more...)


Country:

North America > United States (0.28)
Asia > China > Guangdong Province (0.14)

Genre: Research Report (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Reply to Reviewer # 1

Neural Information Processing Systems, Oct-2-2025, 04:02:38 GMT

Q1: What other ways to generate fake sequences may be suitable for this problem?
A1: That is a good question. GAN to generate some more difficult fake sequences to further improve the ability of the encoder.

Q1: Comparison with other state-of-the-art deep clustering methods which are not designed for time-series.
A1: Following your suggestion, we compare our method with two state-of-the-art deep clustering methods (i.e., DEC (Xie et al.,

Table 1: Comparisons on 36 time series datasets (the numbering of the datasets is consistent with that in Table 2 in the main text)

Dataset   DEC(RI)   IDEC(RI)   DTCR(RI)         DTCR(NMI)   DTCR(ACC)
1         0.5817    0.6210     0.6868(0.0026)

dataset, reviewer, table 1, (15 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)


Planar Ultrametrics for Image Segmentation

Julian E. Yarkony, Charless Fowlkes

Neural Information Processing Systems, Oct-2-2025, 03:41:52 GMT

We study the problem of hierarchical clustering on planar graphs. We formulate this in terms of finding the closest ultrametric to a specified set of distances and solve it using an LP relaxation that leverages minimum cost perfect matching as a subroutine to efficiently explore the space of planar partitions. We apply our algorithm to the problem of hierarchical image segmentation.
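For readers unfamiliar with the term: a distance matrix is an ultrametric when it satisfies the strong triangle inequality d(i, k) <= max(d(i, j), d(j, k)) for every triple, and such distances correspond to a hierarchical clustering. The sketch below only checks that property on small matrices. It is not the LP relaxation described in the abstract, and the function name `is_ultrametric` and both example matrices are hypothetical.

```python
from itertools import permutations


def is_ultrametric(d, tol=1e-12):
    """Return True if the square matrix d is a symmetric ultrametric."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False  # self-distance must be zero
        for j in range(n):
            if abs(d[i][j] - d[j][i]) > tol:
                return False  # distances must be symmetric
    # Strong triangle inequality over every ordered triple of distinct points.
    return all(
        d[i][k] <= max(d[i][j], d[j][k]) + tol
        for i, j, k in permutations(range(n), 3)
    )


# Points 0 and 1 merge at height 1, then join point 2 at height 3.
hierarchical = [[0, 1, 3],
                [1, 0, 3],
                [3, 3, 0]]

# Here d(0, 2) = 3 exceeds max(d(0, 1), d(1, 2)) = 2, so no hierarchy fits.
not_hierarchical = [[0, 1, 3],
                    [1, 0, 2],
                    [3, 2, 0]]

ok = is_ultrametric(hierarchical)
bad = is_ultrametric(not_hierarchical)
```

Finding the closest ultrametric to a matrix like `not_hierarchical` is the optimization problem the abstract refers to; the check above only verifies a candidate answer.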

constraint, graph, segmentation, (13 more...)


Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.36)


Deep Learning-Based Approach for Improving Relational Aggregated Search

Sara Saad Soliman, Ahmed Younes, Islam Elkabani, Ashraf Elsayed

arXiv.org Artificial Intelligence, Oct-2-2025

Due to an information explosion on the internet, there is a need for aggregated search systems that can improve the retrieval and management of content in various formats. To further improve the clustering of Arabic text data in aggregated search environments, this research investigates the application of advanced natural language processing techniques, namely stacked autoencoders and AraBERT embeddings. Traditional search engines are imprecise, not contextually relevant, and not personalized; to move beyond these limitations, we offer enriched, context-aware characterizations of search results. We use a K-means clustering algorithm to discover distinctive features and relationships in these results, and then apply our approach to different Arabic queries to evaluate its effectiveness. Our model illustrates that using stacked autoencoders in representation learning suits clustering tasks and can significantly improve the clustering of search results. It also demonstrates improved accuracy and relevance of search results.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.00966

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Modeling Market States with Clustering and State Machines

Christian Oliva, Silviu Gabriel Tinjala

arXiv.org Artificial Intelligence, Oct-2-2025

This work introduces a new framework for modeling financial markets through an interpretable probabilistic state machine. By clustering historical returns based on momentum and risk features across multiple time horizons, we identify distinct market states that capture underlying regimes, such as expansion, contraction, crisis, or recovery. From a transition matrix representing the dynamics between these states, we construct a probabilistic state machine that models the temporal evolution of the market. This state machine enables the generation of a custom distribution of returns based on a mixture of Gaussian components weighted by state frequencies. We show that the proposed benchmark significantly outperforms the traditional approach in capturing key statistical properties of asset returns, including skewness and kurtosis, and our experiments across random assets and time periods confirm its robustness.
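The step from clustered states to a transition matrix can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the label sequence is made up, the states are assumed to be integer cluster labels ordered in time, and the helper names `transition_matrix` and `state_frequencies` are hypothetical.

```python
def transition_matrix(states, n_states):
    """Estimate row-stochastic transition probabilities from a label sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Count each observed move from one time step's state to the next.
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state with no observed outgoing move is left as an all-zero row.
        matrix.append([c / total for c in row] if total else [0.0] * n_states)
    return matrix


def state_frequencies(states, n_states):
    """Fraction of time steps spent in each state (usable as mixture weights)."""
    return [states.count(s) / len(states) for s in range(n_states)]


# Hypothetical sequence of cluster labels, one per time step.
states = [0, 0, 1, 1, 0, 2, 0]
P = transition_matrix(states, 3)
weights = state_frequencies(states, 3)
```

Row `P[i]` gives the estimated probability of moving from state `i` to each other state, and `weights` plays the role of the state frequencies that weight the Gaussian components in the abstract's description.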

artificial intelligence, machine learning, state machine, (16 more...)

arXiv.org Artificial Intelligence

2510.00953

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)


07a4e20a7bbeeb7a736682b26b16ebe8-Paper.pdf

Neural Information Processing Systems, Oct-1-2025, 23:58:29 GMT

artificial intelligence, data mining, machine learning, (18 more...)


Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(14 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)


Matrix Completion with Noisy Side Information

Kai-Yang Chiang, Cho-Jui Hsieh, Inderjit S. Dhillon

Neural Information Processing Systems, Oct-1-2025, 23:51:04 GMT

We study the matrix completion problem with side information. Side information has been considered in several matrix completion applications, and has been empirically shown to be useful in many cases. Recently, researchers studied the effect of side information for matrix completion from a theoretical viewpoint, showing that sample complexity can be significantly reduced given completely clean features. However, since in reality most given features are noisy or only weakly informative, the development of a model to handle a general feature set, and investigation of how much noisy features can help matrix recovery, remains an important issue. In this paper, we propose a novel model that balances between features and observations simultaneously in order to leverage feature information yet be robust to feature noise. Moreover, we study the effect of general features in theory and show that by using our model, the sample complexity can be lower than matrix completion as long as features are sufficiently informative. This result provides a theoretical insight into the usefulness of general side information. Finally, we consider synthetic data and two applications -- relationship prediction and semi-supervised clustering -- and show that our model outperforms other methods for matrix completion that use features both in theory and practice.

complexity, information, matrix completion, (11 more...)


Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > United States > California (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)


Differentially private subspace clustering

Yining Wang, Yu-Xiang Wang, Aarti Singh

Neural Information Processing Systems, Oct-1-2025, 23:43:03 GMT

Subspace clustering is an unsupervised learning problem that aims at grouping data points into multiple "clusters" so that data points in a single cluster lie approximately on a low-dimensional linear subspace. It was originally motivated by 3D motion segmentation in computer vision, but has recently been generically applied to a wide range of statistical machine learning problems, which often involve sensitive datasets about human subjects. This raises a dire concern for data privacy. In this work, we build on the framework of differential privacy and present two provably private subspace clustering algorithms. We demonstrate via both theory and experiments that one of the presented methods enjoys formal privacy and utility guarantees; the other one asymptotically preserves differential privacy while having good performance in practice. In the course of the proof, we also obtain two new provable guarantees for the agnostic subspace clustering and the graph connectivity problem which might be of independent interest.

algorithm, gibbs sampler, subspace, (13 more...)


Country:

Asia > Singapore (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Montana (0.04)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)


Flattening a Hierarchical Clustering through Active Learning

Fabio Vitale, Anand Rajagopalan, Claudio Gentile

Neural Information Processing Systems, Oct-1-2025, 23:31:48 GMT

We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures.

algorithm, artificial intelligence, machine learning, (16 more...)


Country: Europe (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Crowdsourcing Without People: Modelling Clustering Algorithms as Experts

Jordyn E. A. Lorentz, Katharine M. Clark

arXiv.org Artificial Intelligence, Oct-1-2025

This paper introduces mixsemble, an ensemble method that adapts the Dawid-Skene model to aggregate predictions from multiple model-based clustering algorithms. Unlike traditional crowdsourcing, which relies on human labels, the framework models the outputs of clustering algorithms as noisy annotations. Experiments on both simulated and real-world datasets show that, although the mixsemble is not always the single top performer, it consistently approaches the best result and avoids poor outcomes. This robustness makes it a practical alternative when the true data structure is unknown, especially for non-expert users.

algorithm, artificial intelligence, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2509.25395

Country: