AITopics | coherence

Information about AI from the News, Publications, and Conferences

Feature Starvation as Geometric Instability in Sparse Autoencoders

Chaudhry, Faris, Yano, Keisuke, Monod, Anthea

arXiv.org Machine Learning, May-8-2026

Sparse autoencoders (SAEs) are used to disentangle the dense, polysemantic internal representations of large language models (LLMs) into interpretable, monosemantic concepts. However, standard $\ell_1$-regularized SAEs suffer from feature starvation (dead neurons) and shrinkage bias, often requiring computationally expensive heuristic resampling and nondifferentiable hard-masking methods to bypass these challenges. We argue that feature starvation is not merely an empirical artifact of poor data diversity, but a fundamental optimization-geometric pathology of overcomplete dictionaries: the $\ell_1$-induced sparse coding map is unstable and fundamentally misaligned with shallow, amortized encoders. To address this structural instability, we introduce adaptive elastic net SAEs (AEN-SAEs), a fully differentiable architecture grounded in classical sparse regression. AEN-SAEs combine an $\ell_2$ structural term that enforces strong convexity and Lipschitz stability with adaptive $\ell_1$ reweighting that eliminates shrinkage bias and suppresses spurious features, thereby jointly controlling the curvature and interaction structure of the induced polyhedral geometry. Theoretically, we show that AEN-SAEs yield a Lipschitz-continuous sparse coding map and recover the global feature support under mild assumptions. Empirically, across synthetic settings and LLMs (Pythia 70M, Llama 3.1 8B), AEN-SAEs mitigate feature starvation without auxiliary heuristics while maintaining competitive reconstruction abilities.
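To make the shrinkage terminology in the abstract above concrete, here is a minimal sketch of the classical scalar elastic-net proximal step from sparse regression. It is not the AEN-SAE architecture; the function names, the pilot-based reweighting rule (adaptive-lasso style), and the parameters `gamma` and `eps` are illustrative assumptions.

```python
def soft_threshold(z, lam1):
    """Proximal operator of lam1 * |x|: shrinks z toward zero by lam1."""
    if z > lam1:
        return z - lam1
    if z < -lam1:
        return z + lam1
    return 0.0


def elastic_net_prox(z, lam1, lam2):
    """Minimizer over x of 0.5*(x - z)**2 + lam1*|x| + 0.5*lam2*x**2.

    The l2 term divides the soft-thresholded value by (1 + lam2), so the
    scalar map z -> x is Lipschitz with constant 1 / (1 + lam2).
    """
    return soft_threshold(z, lam1) / (1.0 + lam2)


def adaptive_weight(pilot, gamma=1.0, eps=1e-8):
    """Hypothetical adaptive-lasso style weight: large pilot values get a small penalty."""
    return 1.0 / (abs(pilot) + eps) ** gamma


def adaptive_elastic_net_prox(z, lam1, lam2, pilot):
    """Elastic-net step whose l1 strength is rescaled by a pilot estimate (assumption)."""
    return elastic_net_prox(z, lam1 * adaptive_weight(pilot), lam2)


if __name__ == "__main__":
    # Plain l1 shrinks a strong activation by the full lam1 (shrinkage bias).
    print(elastic_net_prox(4.0, 1.0, 0.0))
    # With reweighting, the same activation is shrunk far less.
    print(adaptive_elastic_net_prox(4.0, 1.0, 0.0, pilot=4.0))
```

In this toy setting the reweighted penalty leaves a large activation nearly untouched while small activations are still zeroed, which is the general idea behind reducing shrinkage bias in adaptive sparse regression.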

feature starvation, large language model, machine learning, (18 more...)

2605.05341

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

3d36c07721a0a5a96436d6c536a132ec-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 13:16:09 GMT

Figure S1: Estimated Networks 1 & 3 from linear factor models of the directed spectrum (DS, top) and Granger causality (GC, bottom) for the simulated data experiment. Each panel shows a grid of DS or GC features associated with the indicated network estimate. Within each grid, each subplot corresponds to the signal transmitted from the channel listed on the left to the channel listed at the top. See Figure 1 for a description of the true networks. Power spectra are reasonable to model using a linear factor model because they satisfy Definition 1 under reasonable assumptions. We will use $S_{cc}(\omega)$ to refer to the spectral power of the signal $v_c(t)$ at frequency $\omega$, and $v_c(\omega)$ to refer to the frequency-domain representation of $v_c(t)$ at $\omega$.

artificial intelligence, directed spectrum, machine learning, (15 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

2d95270d763751439626d91f57e9a750-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 07:05:48 GMT

artificial intelligence, machine learning, relation, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

2d95270d763751439626d91f57e9a750-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 07:05:45 GMT

artificial intelligence, coherence, machine learning, (16 more...)

Country:

North America > Canada (0.69)
Europe (0.68)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.68)

0f83556a305d789b1d71815e8ea4f4b0-Paper.pdf

Neural Information Processing Systems, Apr-24-2026, 17:50:33 GMT

Topic model evaluation, like evaluation of other unsupervised methods, can be contentious. However, the field has coalesced around automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference corpus. Contemporary neural topic models surpass classical ones according to these metrics. At the same time, topic model evaluation suffers from a validation gap: automated coherence, developed for classical models, has not been validated using human experimentation for neural models. In addition, a meta-analysis of topic modeling literature reveals a substantial standardization gap in automated topic modeling benchmarks. To address the validation gap, we compare automated coherence with the two most widely accepted human judgment tasks: topic rating and word intrusion. To address the standardization gap, we systematically evaluate a dominant classical model and two state-of-the-art neural models on two commonly used datasets. Automated evaluations declare a winning model when corresponding human evaluations do not, calling into question the validity of fully automatic evaluations independent of human judgments.
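As an illustration of what an automated coherence estimate based on word co-occurrences can look like, the sketch below computes an average normalized pointwise mutual information (NPMI) score over the top words of a topic, using document-level co-occurrence in a toy reference corpus. NPMI is one common choice and is assumed here; the abstract does not say which metric the authors use. The function name, the corpus, and the handling of degenerate pairs are all illustrative assumptions.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all pairs of topic words (needs at least two words).

    Probabilities are estimated as the fraction of reference documents
    that contain a word (or both words of a pair).
    """
    docs = [set(d) for d in documents]
    n = len(docs)

    def prob(*words):
        return sum(1 for d in docs if all(w in d for w in words)) / n

    scores = []
    for a, b in combinations(topic_words, 2):
        p_ab = prob(a, b)
        if p_ab == 0.0:
            # Words never co-occur: NPMI takes its minimum value.
            scores.append(-1.0)
        elif p_ab == 1.0:
            # Degenerate pair present in every document; scored 0 here (assumption).
            scores.append(0.0)
        else:
            pmi = math.log(p_ab / (prob(a) * prob(b)))
            scores.append(pmi / -math.log(p_ab))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    corpus = [
        ["cat", "dog", "pet"],
        ["cat", "dog"],
        ["stock", "market"],
        ["stock", "market", "trade"],
    ]
    print(npmi_coherence(["cat", "dog"], corpus))    # words that always co-occur
    print(npmi_coherence(["cat", "stock"], corpus))  # words that never co-occur
```

A score like this depends entirely on the reference corpus and the pairing scheme, which is one reason the abstract's call for validation against human judgments matters.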

artificial intelligence, computational linguistic, natural language, (17 more...)

Country:

North America > United States > New York (0.29)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.81)

Coherence-free Entrywise Estimation of Eigenvectors in Low-rank Signal-plus-noise Matrix Models

Neural Information Processing Systems, Mar-22-2026, 18:08:40 GMT

Spectral methods are widely used to estimate eigenvectors of a low-rank signal matrix subject to noise. These methods use the leading eigenspace of an observed matrix to estimate this low-rank signal. Typically, the entrywise estimation error of these methods depends on the coherence of the low-rank signal matrix with respect to the standard basis. In this work, we present a novel method for eigenvector estimation that avoids this dependence on coherence. Assuming a rank-one signal matrix, under mild technical conditions, the entrywise estimation error of our method provably has no dependence on the coherence under Gaussian noise (i.e., in the spiked Wigner model), and achieves the optimal estimation rate up to logarithmic factors. Simulations demonstrate that our method performs well under non-Gaussian noise and that an extension of our method to the case of a rank-$r$ signal matrix has little to no dependence on the coherence. In addition, we derive new metric entropy bounds for rank-$r$ singular subspaces under $\ell_{2,\infty}$ distance, which may be of independent interest. We use these new bounds to improve the best known lower bound for rank-$r$ eigenspace estimation under $\ell_{2,\infty}$ distance.
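For readers unfamiliar with the term, the coherence of a rank-one signal with respect to the standard basis is commonly defined as $n \max_i u_i^2$ for a unit eigenvector $u \in \mathbb{R}^n$. The sketch below computes that quantity under this standard definition, which is assumed here and may differ in normalization from the paper's; the function name is illustrative.

```python
def coherence(u):
    """Coherence n * max_i u_i**2 of a nonzero vector u after normalizing to unit length.

    The value lies between 1 (mass spread evenly over all entries)
    and n (all mass concentrated on a single entry).
    """
    n = len(u)
    norm_sq = sum(x * x for x in u)
    return n * max(x * x for x in u) / norm_sq


if __name__ == "__main__":
    print(coherence([1.0, 1.0, 1.0, 1.0]))  # fully delocalized vector
    print(coherence([1.0, 0.0, 0.0, 0.0]))  # standard basis vector
```

Entrywise error bounds that scale with this quantity become weak for localized eigenvectors, which is the dependence the abstract says the proposed method avoids.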

artificial intelligence, machine learning, proceedings, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)
