AITopics | lim

A Novel Theoretical Analysis for Clustering Heteroscedastic Gaussian Data without Knowledge of the Number of Clusters

Pastor, Dominique, Dupraz, Elsa, Hbilou, Ismail, Ansel, Guillaume

arXiv.org Machine LearningApr-3-2026

This paper addresses the problem of clustering measurement vectors that are heteroscedastic in that they can have different covariance matrices. From the assumption that the measurement vectors within a given cluster are Gaussian distributed with possibly different and unknown covariant matrices around the cluster centroid, we introduce a novel cost function to estimate the centroids. The zeros of the gradient of this cost function turn out to be the fixed-points of a certain function. As such, the approach generalizes the methodology employed to derive the existing Mean-Shift algorithm. But as a main and novel theoretical result compared to Mean-Shift, this paper shows that the sole fixed-points of the identified function tend to be the cluster centroids if both the number of measurements per cluster and the distances between centroids are large enough. As a second contribution, this paper introduces the Wald kernel for clustering. This kernel is defined as the p-value of the Wald hypothesis test for testing the mean of a Gaussian. As such, the Wald kernel measures the plausibility that a measurement vector belongs to a given cluster and it scales better with the dimension of the measurement vectors than the usual Gaussian kernel. Finally, the proposed theoretical framework allows us to derive a new clustering algorithm called CENTRE-X that works by estimating the fixed-points of the identified function. As Mean-Shift, CENTRE-X requires no prior knowledge of the number of clusters. It relies on a Wald hypothesis test to significantly reduce the number of fixed points to calculate compared to the Mean-Shift algorithm, thus resulting in a clear gain in complexity. Simulation results on synthetic and real data sets show that CENTRE-X has comparable or better performance than standard clustering algorithms K-means and Mean-Shift, even when the covariance matrices are not perfectly known.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

2604.01943

Country:

Europe > France > Brittany > Finistère > Brest (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

7143_score_based_generative_models_

윤은비

Neural Information Processing SystemsFeb-15-2026, 13:01:49 GMT

diffusion model, score-based generative model, vy process, (13 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.43)

Add feedback

ConstrainedOptimizationtoTrainNeuralNetworks onCriticaland Under-RepresentedClasses

Neural Information Processing SystemsFeb-11-2026, 08:37:25 GMT

Asaconsequence, removing theerror P would reduce theloss more than removing the error N. Moreover, it is clear that this difference in error weighing increases withthelevelofimbalance between theclasses.

almm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Constraining Geometric

Neural Information Processing SystemsFeb-9-2026, 00:59:23 GMT

Figure 3: Reconstruction (top) anddivergence (bottom) losscomparisonforJSG (left) andJSG (right) against KL(q(z|x)kp(z))(VAE) and KL(p(z)kq(z|x))onthe MNISTdataset.

artificial intelligence, jsg, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

A Foundational Theory of Quantitative Abstraction: Adjunctions, Duality, and Logic for Probabilistic Systems

Anwer, Nivar, López-Rubio, Ezequiel, Elizondo, David, Luque-Baena, Rafael M.

arXiv.org Artificial IntelligenceNov-6-2025

The analysis and control of stochastic dynamical systems rely on probabilistic models such as (continuous-space) Markov decision processes, but large or continuous state spaces make exact analysis intractable and call for principled quantitative abstraction. This work develops a unified theory of such abstraction by integrating category theory, coalgebra, quantitative logic, and optimal transport, centred on a canonical $\varepsilon$-quotient of the behavioral pseudo-metric with a universal property: among all abstractions that collapse behavioral differences below $\varepsilon$, it is the most detailed, and every other abstraction achieving the same discounted value-loss guarantee factors uniquely through it. Categorically, a quotient functor $Q_\varepsilon$ from a category of probabilistic systems to a category of metric specifications admits, via the Special Adjoint Functor Theorem, a right adjoint $R_\varepsilon$, yielding an adjunction $Q_\varepsilon \dashv R_\varepsilon$ that formalizes a duality between abstraction and realization; logically, a quantitative modal $μ$-calculus with separate reward and transition modalities is shown, for a broad class of systems, to be expressively complete for the behavioral pseudo-metric, with a countable fully abstract fragment suitable for computation. The theory is developed coalgebraically over Polish spaces and the Giry monad and validated on finite-state models using optimal-transport solvers, with experiments corroborating the predicted contraction properties and structural stability and aligning with the theoretical value-loss bounds, thereby providing a rigorous foundation for quantitative state abstraction and representation learning in probabilistic domains.

abstraction, artificial intelligence, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.19444

Country:

Europe (1.00)
North America > United States > California (0.27)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

7143_score_based_generative_models_

윤은비

Neural Information Processing SystemsOct-8-2025, 23:42:51 GMT

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.43)

Add feedback

678cffc05549fdabda971127602084c6-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 20:14:48 GMT

artificial intelligence, inequality, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.45)

Add feedback

Convex Regression with a Penalty

Lim, Eunji

arXiv.org Machine LearningSep-25-2025

A common way to estimate an unknown convex regression function $f_0: Ω\subset \mathbb{R}^d \rightarrow \mathbb{R}$ from a set of $n$ noisy observations is to fit a convex function that minimizes the sum of squared errors. However, this estimator is known for its tendency to overfit near the boundary of $Ω$, posing significant challenges in real-world applications. In this paper, we introduce a new estimator of $f_0$ that avoids this overfitting by minimizing a penalty on the subgradient while enforcing an upper bound $s_n$ on the sum of squared errors. The key advantage of this method is that $s_n$ can be directly estimated from the data. We establish the uniform almost sure consistency of the proposed estimator and its subgradient over $Ω$ as $n \rightarrow \infty$ and derive convergence rates. The effectiveness of our estimator is illustrated through its application to estimating waiting times in a single-server queue.

problem, regression, subgradient, (15 more...)

arXiv.org Machine Learning

2509.19788

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York > Nassau County > Garden City (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

A Minimum Description Length Approach to Regularization in Neural Networks

Abudy, Matan, Well, Orr, Chemla, Emmanuel, Katzir, Roni, Lan, Nur

arXiv.org Artificial IntelligenceSep-9-2025

State-of-the-art neural networks can be trained to become remarkable solutions to many problems. But while these architectures can express symbolic, perfect solutions, trained models often arrive at approximations instead. We show that the choice of regularization method plays a crucial role: when trained on formal languages with standard regularization ($L_1$, $L_2$, or none), expressive architectures not only fail to converge to correct solutions but are actively pushed away from perfect initializations. In contrast, applying the Minimum Description Length (MDL) principle to balance model complexity with data fit provides a theoretically grounded regularization method. Using MDL, perfect solutions are selected over approximations, independently of the optimization algorithm. We propose that unlike existing regularization techniques, MDL introduces the appropriate inductive bias to effectively counteract overfitting and promote generalization.

artificial intelligence, machine learning, regularization, (17 more...)

arXiv.org Artificial Intelligence

2505.13398

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

On Balancing Sparsity with Reliable Connectivity in Distributed Network Design with Random K-out Graphs

Sood, Mansi, Elumar, Eray Can, Yagan, Osman

arXiv.org Artificial IntelligenceAug-19-2025

In several applications in distributed systems, an important design criterion is ensuring that the network is sparse, i.e., does not contain too many edges, while achieving reliable connectivity. Sparsity ensures communication overhead remains low, while reliable connectivity is tied to reliable communication and inference on decentralized data reservoirs and computational resources. A class of network models called random K-out graphs appear widely as a heuristic to balance connectivity and sparsity, especially in settings with limited trust, e.g., privacy-preserving aggregation of networked data in which networks are deployed. However, several questions remain regarding how to choose network parameters in response to different operational requirements, including the need to go beyond asymptotic results and the ability to model the stochastic and adversarial environments. To address this gap, we present theorems to inform the choice of network parameters that guarantee reliable connectivity in regimes where nodes can be finite or unreliable. We first derive upper and lower bounds for probability of connectivity in random K-out graphs when the number of nodes is finite. Next, we analyze the property of r-robustness, a stronger notion than connectivity that enables resilient consensus in the presence of malicious nodes. Finally, motivated by aggregation mechanisms based on pairwise masking, we model and analyze the impact of a subset of adversarial nodes, modeled as deletions, on connectivity and giant component size - metrics that are closely tied to privacy guarantees. Together, our results pave the way for end-to-end performance guarantees for a suite of algorithms for reliable inference on networks.

artificial intelligence, graph, node, (15 more...)

arXiv.org Artificial Intelligence

2508.11863

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.34)

Industry: