AITopics | cluster center

cluster center

Central Description Length (CDL) Clustering Validation Index

Shamsi, Mahdi, Beheshti, Soosan

arXiv.org Machine Learning | Jun-5-2026

Selecting a clustering algorithm and its hyperparameters without labels is a common difficulty in engineering machine learning pipelines that work with unsupervised analysis of sensor, image, or process data. Clustering validation indices (CVIs) provide internal scores for ranking candidate clusterings, but most popular CVIs are built from Euclidean compactness and separation terms and so tend to favour compact, convex partitions. Their performance is known to degrade on non-convex, irregular, or variable-density data, where kernel transformations or alternative distance measures are typically used at the cost of additional tuning and computation. This paper introduces the Central Description Length (CDL) clustering validation index. CDL uses the observed within-cluster compactness, the estimated cluster centers, and the estimated cluster covariances to compute a probabilistic upper bound on the description length associated with the unobservable true cluster centers. The bound condenses intra-cluster compactness and centroid displacement into a single computable quantity and is evaluated on the partition produced by any clustering algorithm. The implementation uses only observable quantities (the data, the partition, the estimated centers, and the estimated covariances) and does not use ground-truth labels. On synthetic benchmarks with non-convex and arbitrary-shape clusters, CDL-CVI selected the reference number of clusters more often and reached higher Adjusted Rand Index (ARI) values than the conventional CVIs we tested, without an additional kernel preprocessing stage. On image benchmarks (MNIST, CIFAR-10, STL-10) clustered from frozen unsupervised embeddings, CDL-CVI returned cluster numbers close to the reference class counts across K-means, DBSCAN, and spectral clustering in the reported trials. We also discuss limitations of the approach, in particular its dependence on covariance estimation, the chosen distance metric, and the input representation.

1 Introduction
Many engineering machine learning pipelines rely on the clustering of unlabeled measurements: fault diagnosis from vibration and acoustic signals, sensor state discovery in industrial processes, condition monitoring of mechanical and electrical systems, materials characterization, segmentation of images and signals, and exploratory grouping of process variables.
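
To make the idea of a conventional CVI concrete, the following is a minimal sketch of the Davies-Bouldin index, one well-known internal score built from Euclidean compactness and separation terms. It is not the CDL index from the abstract, whose formula is not given here. The function names and the toy data are illustrative assumptions.

```python
import math


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))


def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition (lower is better).

    Assumes at least two non-empty clusters with distinct centroids.
    """
    cents = [centroid(c) for c in clusters]
    # Compactness term: mean Euclidean distance of members to their centroid.
    scatter = [
        sum(math.dist(p, m) for p in c) / len(c) for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Separation term: distance between centroids i and j.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Two candidate partitions of the same four points.
good = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
bad = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
scores = {"good": davies_bouldin(good), "bad": davies_bouldin(bad)}
best = min(scores, key=scores.get)
```

Ranking candidate clusterings then amounts to computing the score for each and keeping the best one. Because both terms are Euclidean distances to centroids, a score of this kind rewards compact, convex groups, which is the bias the abstract describes.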

artificial intelligence, hyperparameter, machine learning, (17 more...)

arXiv.org Machine Learning

2606.0523

Country: North America > Canada > Ontario (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

End-to-end Learnable Clustering for Intent Learning in Recommendation

Neural Information Processing Systems | May-1-2026, 01:32:55 GMT

Intent learning, which aims to learn users' intents for user understanding and item recommendation, has become an active research topic in recent years. However, existing methods suffer from complex and cumbersome alternating optimization, limiting performance and scalability. To this end, we propose a novel intent learning method termed ELCRec, which unifies behavior representation learning into an End-to-end Learnable Clustering framework for effective and efficient Recommendation.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.57)

A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning (Supplementary File)

Neural Information Processing Systems | Apr-25-2026, 09:01:34 GMT

This supplementary document contains additional experimental details and the technical proofs of the convergence results of the NeurIPS'21 submission entitled "A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning". It is structured as follows. In Appendix A, we provide more experimental details, including the training algorithm, network architecture, optimizer details, loss construction, and training cost of SANE. Appendix B presents the proof and details of the main results, namely, Theorem 1, in Section 2, which analyzes the generalization performance of MoCo. Next, Appendix C introduces the proof roadmap and details of the main results, i.e.

artificial intelligence, augmentation, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Fast and Provably Good Seedings for k-Means

Olivier Bachem, Mario Lucic, Hamed Hassani, Andreas Krause

Neural Information Processing Systems | Apr-22-2026, 08:41:17 GMT

Seeding - the task of finding initial cluster centers - is critical in obtaining high-quality clusterings for k-Means. However, k-means++ seeding, the state-of-the-art algorithm, does not scale well to massive datasets as it is inherently sequential and requires k full passes through the data. It was recently shown that Markov chain Monte Carlo sampling can be used to efficiently approximate the seeding step of k-means++. However, this result requires assumptions on the data generating distribution. We propose a simple yet fast seeding algorithm that produces provably good clusterings even without assumptions on the data. Our analysis shows that the algorithm allows for a favourable trade-off between solution quality and computational cost, speeding up k-means++ seeding by up to several orders of magnitude.
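
For reference, the baseline the abstract calls inherently sequential can be sketched as follows. This is the standard k-means++ seeding step (sampling proportional to squared distance), not the faster algorithm the authors propose. The function names and the toy data are illustrative assumptions.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng):
    """Standard k-means++ seeding: one full pass over the data per center."""
    centers = [rng.choice(points)]  # first center: uniform at random
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            # Sample the next center with probability proportional to d2.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            centers.append(points[idx])
        # Full pass: refresh nearest-center distances against the new center.
        d2 = [min(d, sq_dist(p, centers[-1])) for d, p in zip(d2, points)]
    return centers


points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0),
    (20.0, 0.0), (21.0, 0.0),
]
seeds = kmeanspp_seed(points, 3, random.Random(0))
```

Each new center depends on the distances to all centers chosen before it, so the loop cannot be parallelized across centers and touches every point k times. That is the cost which approximate seeding schemes aim to reduce.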

artificial intelligence, assumption-free k-mc2, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Distributionally Robust K-Means Clustering

Malik, Vikrant, Kargin, Taylan, Hassibi, Babak

arXiv.org Machine Learning | Apr-14-2026

In recent years, the widespread availability of large-scale, high-dimensional datasets has driven significant interest in clustering algorithms that are both computationally efficient and robust to distributional shifts and outliers. The classical clustering method, K-means, can be seen as an application of the Lloyd-Max quantization algorithm, in which the distribution being quantized is the empirical distribution of the points to be clustered. This empirical distribution generally differs from the true underlying distribution, especially when the number of points to be clustered is small. This induces a distributional shift, which can also arise in many real-world settings, such as image segmentation, biological data analysis, and sensor networks, due to noise variations, sensor inaccuracies, or environmental changes. Distributional shifts can severely impact the performance of clustering algorithms, leading to degraded cluster assignments and unreliable downstream analysis. The field of clustering has a rich history. One of the most popular algorithms in this field is the K-means (KM) algorithm, introduced by [1], which computes centroids by iteratively updating the conditional mean of the data in the Voronoi regions induced by the centroids. However, standard K-means is sensitive to initialization and, in general, converges only to a local minimum.
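
The iteration described above (assign each point to its nearest centroid, then replace each centroid with the mean of its Voronoi region) can be sketched as follows. This is plain K-means, not the distributionally robust variant the paper proposes. The function names and the four-point dataset are illustrative assumptions, chosen so that a poor initialization visibly stalls at a worse fixed point.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_step(points, centroids):
    """One iteration: nearest-centroid assignment, then per-region means."""
    regions = [[] for _ in centroids]
    for p in points:
        j = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        regions[j].append(p)
    updated = []
    for old, members in zip(centroids, regions):
        if not members:
            updated.append(old)  # keep the centroid of an empty region
        else:
            updated.append(
                tuple(sum(m[d] for m in members) / len(members)
                      for d in range(len(old)))
            )
    return updated


def kmeans(points, centroids, max_iter=100):
    """Iterate until the centroids stop changing or max_iter is reached."""
    for _ in range(max_iter):
        updated = lloyd_step(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


def cost(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = kmeans(points, [(0, 0), (10, 0)])  # splits left from right
stuck = kmeans(points, [(5, 0), (5, 2)])  # splits bottom from top and stays there
```

On this toy data the second initialization is already a fixed point of the iteration, with cost 100 against 4 for the first, which illustrates the sensitivity to initialization noted in the abstract.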

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2604.11118

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Oakland (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Learning Deep Parsimonious Representations

Renjie Liao, Alex Schwing, Richard Zemel, Raquel Urtasun

Neural Information Processing Systems | Mar-23-2026, 20:51:45 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

CNNpack: Packing Convolutional Neural Networks in the Frequency Domain

Yunhe Wang, Chang Xu, Shan You, Dacheng Tao, Chao Xu

Neural Information Processing Systems | Mar-23-2026, 06:19:29 GMT

Deep convolutional neural networks (CNNs) are successfully used in a number of applications. However, their storage and computational requirements have largely prevented their widespread use on mobile devices. Here we present an effective CNN compression approach in the frequency domain, which focuses not only on smaller weights but on all the weights and their underlying connections. By treating convolutional filters as images, we decompose their representations in the frequency domain as common parts (i.e., cluster centers) shared by other similar filters and their individual private parts (i.e., individual residuals). A large number of low-energy frequency coefficients in both parts can be discarded to produce high compression without significantly compromising accuracy. We relax the computational burden of convolution operations in CNNs by linearly combining the convolution responses of discrete cosine transform (DCT) bases. The compression and speed-up ratios of the proposed algorithm are thoroughly analyzed and evaluated on benchmark image datasets to demonstrate its superiority over state-of-the-art methods.

artificial intelligence, frequency domain, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++

Dennis Wei

Neural Information Processing Systems | Mar-23-2026, 06:19:11 GMT

This paper studies the k-means++ algorithm for clustering as well as the class of D^ℓ sampling algorithms to which k-means++ belongs. It is shown that for any constant factor β > 1, selecting βk cluster centers by D^ℓ sampling yields a constant-factor approximation to the optimal clustering with k centers, in expectation and without conditions on the dataset. This result extends the previously known O(log k) guarantee for the case β = 1 to the constant-factor bi-criteria regime. It also improves upon an existing constant-factor bi-criteria result that holds only with constant probability.
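
As an illustration of the bi-criteria setting, the sketch below draws more than k centers using the sampling family that k-means++ belongs to. Each new center is drawn with probability proportional to the ℓ-th power of its distance to the nearest center chosen so far, and ℓ = 2 recovers k-means++. The function names, the rounding of βk with `math.ceil`, and the grid data are assumptions made for this example. The code does not verify the approximation guarantee.

```python
import math
import random


def dl_sampling(points, num_centers, ell, rng):
    """Draw centers with probability proportional to (nearest distance) ** ell."""
    centers = [rng.choice(points)]  # first center: uniform at random
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < num_centers:
        weights = [d ** ell for d in dist]
        if sum(weights) == 0:
            break  # no remaining point is distinct from the chosen centers
        idx = rng.choices(range(len(points)), weights=weights, k=1)[0]
        centers.append(points[idx])
        dist = [min(d, math.dist(p, points[idx])) for d, p in zip(dist, points)]
    return centers


def bicriteria_centers(points, k, beta, ell=2, rng=None):
    """Select ceil(beta * k) centers, to be compared against the best k-center cost."""
    rng = rng or random.Random(0)
    return dl_sampling(points, math.ceil(beta * k), ell, rng)


grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
centers = bicriteria_centers(grid, k=3, beta=2.0)
```

Here `beta=2.0` yields six centers for `k=3`. The extra centers are what the bi-criteria analysis trades for a constant-factor bound.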

artificial intelligence, lemma 3, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification

Neural Information Processing Systems | Mar-22-2026, 04:04:52 GMT

Unsupervised visible-infrared person re-identification (USVI-ReID) aims to match specified persons in infrared images to visible images without annotations, and vice versa. USVI-ReID is a challenging yet underexplored task. Most existing methods address USVI-ReID through cluster-based contrastive learning, which simply employs the cluster center to represent an individual. However, the cluster center primarily captures commonality, overlooking divergence and variety. To address this problem, we propose a Progressive Contrastive Learning with Hard and Dynamic Prototypes method for USVI-ReID.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (0.66)
Information Technology > Artificial Intelligence > Machine Learning (0.63)

SuperVLAD: Compact and Robust Image Descriptors for Visual Place Recognition

Neural Information Processing Systems | Mar-17-2026, 23:12:08 GMT

Visual place recognition (VPR) is an essential task for multiple applications such as augmented reality and robot localization. Over the past decade, mainstream VPR methods have used feature representations based on global aggregation, as exemplified by NetVLAD. These features are suitable for large-scale VPR and robust against viewpoint changes. However, VLAD-based aggregation methods usually learn a large number of clusters (e.g., 64) and their corresponding cluster centers, which directly leads to a high dimension of the resulting global features. More importantly, when there is a domain gap between the training and inference data, the cluster centers determined on the training set are usually poorly suited to inference, resulting in a performance drop.
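
The link between the number of cluster centers and the descriptor dimension can be seen in a minimal sketch of classic hard-assignment VLAD. This is neither NetVLAD, which uses learned soft assignment, nor the SuperVLAD method of the paper. The function name and the toy inputs are illustrative assumptions.

```python
import math


def vlad(descriptors, centers):
    """Classic VLAD: sum residuals to the nearest center, concatenate, L2-normalize."""
    k, d = len(centers), len(centers[0])
    acc = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard assignment to the nearest cluster center.
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])),
        )
        # Accumulate the residual between the descriptor and its center.
        for i in range(d):
            acc[j][i] += x[i] - centers[j][i]
    flat = [v for row in acc for v in row]  # global feature of length k * d
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


centers = [(0.0, 0.0), (10.0, 10.0)]
local_descriptors = [(1.0, 0.0), (0.0, 1.0), (10.0, 12.0)]
feature = vlad(local_descriptors, centers)
```

The output length is the number of centers times the local descriptor dimension, so 64 centers over 128-dimensional local descriptors would give an 8192-dimensional global feature. The residuals are also measured against fixed centers, which is why centers fitted on one domain can transfer poorly to another.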

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.55)
