AITopics | degeneracy

Collaborating Authors

degeneracy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

fc2e6a440b94f64831840137698021e1-Supplemental.pdf

Neural Information Processing SystemsFeb-12-2026, 00:57:47 GMT

ja 0, jb 0, relaxation, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California > Orange County > Irvine (0.14)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

7cc532d783a7461f227a5da8ea80bfe1-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 12:26:49 GMT

critical point, loss landscape, neural network, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
Europe > France (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

Transcendental Regularization of Finite Mixtures:Theoretical Guarantees and Practical Limitations

Fokoué, Ernest

arXiv.org Machine LearningFeb-5-2026

Finite mixture models are widely used for unsupervised learning, but maximum likelihood estimation via EM suffers from degeneracy as components collapse. We introduce transcendental regularization, a penalized likelihood framework with analytic barrier functions that prevent degeneracy while maintaining asymptotic efficiency. The resulting Transcendental Algorithm for Mixtures of Distributions (TAMD) offers strong theoretical guarantees: identifiability, consistency, and robustness. Empirically, TAMD successfully stabilizes estimation and prevents collapse, yet achieves only modest improvements in classification accuracy-highlighting fundamental limits of mixture models for unsupervised learning in high dimensions. Our work provides both a novel theoretical framework and an honest assessment of practical limitations, implemented in an open-source R package.

artificial intelligence, machine learning, transcendental regularization, (18 more...)

arXiv.org Machine Learning

2602.03889

Country: North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

Universal Boosting Variational Inference

Neural Information Processing SystemsDec-24-2025, 23:43:00 GMT

Boosting variational inference (BVI) approximates an intractable probability density by iteratively building up a mixture of simple component distributions one at a time, using techniques from sparse convex optimization to provide both computational scalability and approximation error guarantees. But the guarantees have strong conditions that do not often hold in practice, resulting in degenerate component optimization problems; and we show that the ad-hoc regularization used to prevent degeneracy in practice can cause BVI to fail in unintuitive ways. We thus develop universal boosting variational inference (UBVI), a BVI scheme that exploits the simple geometry of probability densities under the Hellinger metric to prevent the degeneracy of other gradient-based BVI methods, avoid difficult joint optimizations of both component and weight, and simplify fully-corrective weight optimizations. We show that for any target density and any mixture component family, the output of UBVI converges to the best possible approximation in the mixture family, even when the mixture family is misspecified. We develop a scalable implementation based on exponential family mixture components and standard stochastic optimization techniques. Finally, we discuss statistical benefits of the Hellinger distance as a variational objective through bounds on posterior probability, moment, and importance sampling errors. Experiments on multiple datasets and models show that UBVI provides reliable, accurate posterior approximations.

name change, optimization, variational inference, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Globally Convergent Policy Search for Output Estimation

Neural Information Processing SystemsDec-24-2025, 19:08:03 GMT

We introduce the first direct policy search algorithm which provably converges to the globally optimal dynamic filter for the classical problem of predicting the outputs of a linear dynamical system, given noisy, partial observations. Despite the ubiquity of partial observability in practice, theoretical guarantees for direct policy search algorithms, one of the backbones of modern reinforcement learning, have proven difficult to achieve. This is primarily due to the degeneracies which arise when optimizing over filters that maintain an internal state. In this paper, we provide a new perspective on this challenging problem based on the notion of informativity, which intuitively requires that all components of a filter's internal state are representative of the true state of the underlying dynamical system. We show that informativity overcomes the aforementioned degeneracy. Specifically, we propose a regularizer which explicitly enforces informativity, and establish that gradient descent on this regularized objective - combined with a "reconditioning step" - converges to the globally optimal cost at a $O(1/T)$ rate.

globally convergent policy search, name change, output estimation, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Add feedback

Maximum Mean Discrepancy with Unequal Sample Sizes via Generalized U-Statistics

Wei, Aaron, Jalali, Milad, Sutherland, Danica J.

arXiv.org Machine LearningDec-17-2025

Existing two-sample testing techniques, particularly those based on choosing a kernel for the Maximum Mean Discrepancy (MMD), often assume equal sample sizes from the two distributions. Applying these methods in practice can require discarding valuable data, unnecessarily reducing test power. W e address this long-standing limitation by extending the theory of generalized U-statistics and applying it to the usual MMD estimator, resulting in new characterization of the asymptotic distributions of the MMD estimator with unequal sample sizes (particularly outside the proportional regimes required by previous partial results). This generalization also provides a new criterion for optimizing the power of an MMD test with unequal sample sizes. Our approach preserves all available data, enhancing test accuracy and applicability in realistic settings. Along the way, we give much cleaner characterizations of the variance of MMD estimators, revealing something that might be surprising to those in the area: while zero MMD implies a degenerate estimator, it is sometimes possible to have a degenerate estimator with nonzero MMD as well; we give a construction and a proof that it does not happen in common situations.

estimator, kernel, variance, (15 more...)

arXiv.org Machine Learning

2512.13997

Country:

North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > Canada > Quebec (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Parallelizing Tree Search with Twice Sequential Monte Carlo

Oren, Yaniv, de Vries, Joery A., van der Vaart, Pascal R., Spaan, Matthijs T. J., Böhmer, Wendelin

arXiv.org Artificial IntelligenceNov-19-2025

Model-based reinforcement learning (RL) methods that leverage search are responsible for many milestone breakthroughs in RL. Sequential Monte Carlo (SMC) recently emerged as an alternative to the Monte Carlo Tree Search (MCTS) algorithm which drove these breakthroughs. SMC is easier to parallelize and more suitable to GPU acceleration. However, it also suffers from large variance and path degeneracy which prevent it from scaling well with increased search depth, i.e., increased sequential compute. To address these problems, we introduce Twice Sequential Monte Carlo Tree Search (TSMCTS). Across discrete and continuous environments TSMCTS outperforms the SMC baseline as well as a popular modern version of MCTS. Through variance reduction and mitigation of path degeneracy, TSMCTS scales favorably with sequential compute while retaining the properties that make SMC natural to parallelize.

machine learning, particle, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2511.1422

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(2 more...)

Add feedback

Translation Entropy: A Statistical Framework for Evaluating Translation Systems

Gross, Ronit D., Harel, Yanir, Kanter, Ido

arXiv.org Artificial IntelligenceNov-18-2025

The translation of written language has been known since the 3rd century BC; however, its necessity has become increasingly common in the information age. Today, many translators exist, based on encoder-decoder deep architectures, nevertheless, no quantitative objective methods are available to assess their performance, likely because the entropy of even a single language remains unknown. This study presents a quantitative method for estimating translation entropy, with the following key finding. Given a translator, several sentences that differ by only one selected token of a given pivot sentence yield identical translations. Analyzing the statistics of this phenomenon across an ensemble of such sentences, consisting each of a pivot selected token, yields the probabilities of replacing this specific token with others while preserving the translation. These probabilities constitute the entropy of the selected token, and the average across all selected pivot tokens provides an estimate of the translator's overall translation entropy, which is enhanced along the decoder blocks. This entropic measure allows for the quantitative ranking of several publicly available translators and reveals whether mutual translation entropy is symmetric. Extending the proposed method to include the replacement of two tokens in a given pivot sentence demonstrates a multiplicative effect, where translation degeneracy is proportional to the product of the degeneracies of the two tokens. These findings establish translation entropy as a measurable property and objective benchmarking of artificial translators. Results are based on MarianMT, T5-Base and NLLB-200 translators.

machine learning, natural language, translation, (19 more...)

arXiv.org Artificial Intelligence

2511.1318

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Finding separatrices of dynamical flows with Deep Koopman Eigenfunctions

Dabholkar, Kabir V., Barak, Omri

arXiv.org Artificial IntelligenceNov-14-2025

Many natural systems, including neural circuits involved in decision making, are modeled as high-dimensional dynamical systems with multiple stable states. While existing analytical tools primarily describe behavior near stable equilibria, characterizing separatrices--the manifolds that delineate boundaries between different basins of attraction--remains challenging, particularly in high-dimensional settings. Here, we introduce a numerical framework leveraging Koopman Theory combined with Deep Neural Networks to effectively characterize separatrices. Specifically, we approximate Koopman Eigenfunctions (KEFs) associated with real positive eigenvalues, which vanish precisely at the separatrices. Utilizing these scalar KEFs, optimization methods efficiently locate separatrices even in complex systems. We demonstrate our approach on synthetic benchmarks, ecological network models, and high-dimensional recurrent neural networks trained on either neuroscience-inspired tasks or fit to real neural data. Moreover, we illustrate the practical utility of our method by designing optimal perturbations that can shift systems across separatrices, enabling predictions relevant to optogenetic stimulation experiments in neuroscience.

artificial intelligence, machine learning, separatrix, (16 more...)

arXiv.org Artificial Intelligence

2505.15231

Country: