AITopics | Clustering

Collaborating Authors

Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Bayesian Nonparametric Dynamical Clustering of Time Series

Pérez-Herrero, Adrián, Félix, Paulo, Presedo, Jesús, Ek, Carl Henrik

arXiv.org Machine LearningOct-9-2025

Abstract--We present a method that models the evolution of an unbounded number of time series clusters by switching among an unknown number of regimes with linear dynamics. We develop a Bayesian non-parametric approach using a hierarchical Dirichlet process as a prior on the parameters of a Switching Linear Dynamical System and a Gaussian process prior to model the statistical variations in amplitude and temporal alignment within each cluster . By modeling the evolution of time series patterns, the method avoids unnecessary proliferation of clusters in a principled manner . We perform inference by formulating a variational lower bound for off-line and on-line scenarios, enabling efficient learning through optimization. We illustrate the versatility and effectiveness of the approach through several case studies of electrocardiogram analysis using publicly available databases. Index T erms--Time series analysis, Bayesian methods, Gaussian processes, linear dynamical systems, Dirichlet processes, unsupervised learning, electrocardiogram, arrhythmia detection. IME series data analysis has come to pervade all scientific and technological domains, driven by the need to understand change over time. With the growing availability of such data, machine learning has assumed an increasingly central role in a wide variety of tasks which fall under the category of pattern recognition. Particularly, there is growing interest in identifying similar behaviors in time series data as a preliminary step towards generating insights into the dynamics of the underlying processes. Some recent methodologies can be found for characterizing sea wave conditions [1], transcriptome-wide gene expression profiling [2], selecting stocks with different share price performance [3], and discovering human motion primitives [4].

gaussian process, neural information processing system, sequence, (12 more...)

arXiv.org Machine Learning

2510.06919

Country:

Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Add feedback

Angular Constraint Embedding via SpherePair Loss for Constrained Clustering

Zhang, Shaojie, Chen, Ke

arXiv.org Artificial IntelligenceOct-9-2025

Constrained clustering integrates domain knowledge through pairwise constraints. However, existing deep constrained clustering (DCC) methods are either limited by anchors inherent in end-to-end modeling or struggle with learning discriminative Euclidean embedding, restricting their scalability and real-world applicability. To avoid their respective pitfalls, we propose a novel angular constraint embedding approach for DCC, termed SpherePair. Using the SpherePair loss with a geometric formulation, our method faithfully encodes pairwise constraints and leads to embeddings that are clustering-friendly in angular space, effectively separating representation learning from clustering. SpherePair preserves pairwise relations without conflict, removes the need to specify the exact number of clusters, generalizes to unseen data, enables rapid inference of the number of clusters, and is supported by rigorous theoretical guarantees. Comparative evaluations with state-of-the-art DCC methods on diverse benchmarks, along with empirical validation of theoretical insights, confirm its superior performance, scalability, and overall real-world effectiveness. Code is available at \href{https://github.com/spherepaircc/SpherePairCC/tree/main}{our repository}.

constraint, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.06907

Country:

Europe (0.45)
North America (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Chem-NMF: Multi-layer $α$-divergence Non-Negative Matrix Factorization for Cardiorespiratory Disease Clustering, with Improved Convergence Inspired by Chemical Catalysts and Rigorous Asymptotic Analysis

Torabi, Yasaman, Shirani, Shahram, Reilly, James P.

arXiv.org Artificial IntelligenceOct-9-2025

Non-Negative Matrix Factorization (NMF) is an unsupervised learning method offering low-rank representations across various domains such as audio processing, biomedical signal analysis, and image recognition. The incorporation of $α$-divergence in NMF formulations enhances flexibility in optimization, yet extending these methods to multi-layer architectures presents challenges in ensuring convergence. To address this, we introduce a novel approach inspired by the Boltzmann probability of the energy barriers in chemical reactions to theoretically perform convergence analysis. We introduce a novel method, called Chem-NMF, with a bounding factor which stabilizes convergence. To our knowledge, this is the first study to apply a physical chemistry perspective to rigorously analyze the convergence behaviour of the NMF algorithm. We start from mathematically proven asymptotic convergence results and then show how they apply to real data. Experimental results demonstrate that the proposed algorithm improves clustering accuracy by 5.6% $\pm$ 2.7% on biomedical signals and 11.1% $\pm$ 7.2% on face images (mean $\pm$ std).

artificial intelligence, machine learning, pattern recognition, (15 more...)

arXiv.org Artificial Intelligence

2510.06632

Country: North America > Canada > Ontario (0.68)

Genre: Research Report > Promising Solution (0.54)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.66)

Add feedback

Autonomy-Aware Clustering: When Local Decisions Supersede Global Prescriptions

Srivastava, Amber, Basiri, Salar, Salapaka, Srinivasa

arXiv.org Artificial IntelligenceOct-9-2025

Clustering arises in a wide range of problem formulations, yet most existing approaches assume that the entities under clustering are passive and strictly conform to their assigned groups. In reality, entities often exhibit local autonomy, overriding prescribed associations in ways not fully captured by feature representations. Such autonomy can substantially reshape clustering outcomes -- altering cluster compositions, geometry, and cardinality -- with significant downstream effects on inference and decision-making. We introduce autonomy-aware clustering, a reinforcement learning (RL) framework that learns and accounts for the influence of local autonomy without requiring prior knowledge of its form. Our approach integrates RL with a Deterministic Annealing (DA) procedure, where, to determine underlying clusters, DA naturally promotes exploration in early stages of annealing and transitions to exploitation later. We also show that the annealing procedure exhibits phase transitions that enable design of efficient annealing schedules. To further enhance adaptability, we propose the Adaptive Distance Estimation Network (ADEN), a transformer-based attention model that learns dependencies between entities and cluster representatives within the RL loop, accommodates variable-sized inputs and outputs, and enables knowledge transfer across diverse problem instances. Empirical results show that our framework closely aligns with underlying data dynamics: even without explicit autonomy models, it achieves solutions close to the ground truth (gap ~3-4%), whereas ignoring autonomy leads to substantially larger gaps (~35-40%). The code and data are publicly available at https://github.com/salar96/AutonomyAwareClustering.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.25775

Country: North America > United States > Illinois (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

82aec8518602748540a42b783468c94d-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 23:59:53 GMT

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Baltimore (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Error Discovery By Clustering Influence Embeddings Fulton Wang

Neural Information Processing SystemsOct-8-2025, 23:59:21 GMT

We formalize coherence --a requirement that erroneous predictions, within a slice, should be wrong for the same reason --as a key property that any slice discovery method should satisfy.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Asia > British Indian Ocean Territory > Diego Garcia (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Add feedback

Replicable Clustering

Neural Information Processing SystemsOct-8-2025, 23:12:51 GMT

This means that the algorithm designer needs to balance several conflicting desiderata.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

6c5b82193c5d8e6aa5806239676ddc97-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 20:46:38 GMT

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.72)
Information Technology > Data Science (0.71)

Add feedback

Supplementary Material for Semantic Image Synthesis with Unconditional Generator

Neural Information Processing SystemsOct-8-2025, 20:16:20 GMT

This process enables the value (feature maps) to be rearranged (through a weighted sum) to align with the form of the query, thereby reflecting their strong correspondence. The input noise is removed because its stochasticity slows down the training. Given the need for balancing between high correspondence and image quality, we empirically set the weights of our loss terms. To demonstrate the influence of the additional losses introduced in our method, we provide both quantitative and qualitative ablations in Figure S2 and S3, respectively. Nonetheless, caution is warranted when overly increasing the number of clusters.

artificial intelligence, machine learning, proxy mask, (12 more...)

Neural Information Processing Systems

Country: Asia > South Korea > Seoul > Seoul (0.05)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.37)

Add feedback

Filters

Collaborating Authors

Clustering

Bayesian Nonparametric Dynamical Clustering of Time Series

Angular Constraint Embedding via SpherePair Loss for Constrained Clustering

Chem-NMF: Multi-layer $α$-divergence Non-Negative Matrix Factorization for Cardiorespiratory Disease Clustering, with Improved Convergence Inspired by Chemical Catalysts and Rigorous Asymptotic Analysis

Autonomy-Aware Clustering: When Local Decisions Supersede Global Prescriptions

82aec8518602748540a42b783468c94d-Paper-Conference.pdf

Error Discovery By Clustering Influence Embeddings Fulton Wang

Replicable Clustering

6c5b82193c5d8e6aa5806239676ddc97-Paper-Conference.pdf

Supplementary Material for Semantic Image Synthesis with Unconditional Generator

678594bcff6f99f3b7a8ff459989b1a3-Supplemental-Conference.pdf