AITopics | absorption

absorption

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

Neural Information Processing Systems | Jun-18-2026, 14:46:01 GMT

As we increase the number of features in the SAE, hierarchical features tend to split into finer features ("math" may split into "algebra", "geometry", etc.), a phenomenon referred to as feature splitting. However, we show that sparse decomposition and splitting of hierarchical features is not robust. Specifically, we show that seemingly monosemantic features fail to fire where they should, and instead get "absorbed" into their children features. We coin this phenomenon feature absorption, and show that it is caused by optimizing for sparsity in SAEs whenever the underlying features form a hierarchy. We introduce a metric to detect absorption in SAEs, and validate our findings empirically on hundreds of LLM SAEs. Our investigation suggests that varying SAE sizes or sparsity is insufficient to solve this issue. We discuss the implications of feature absorption in SAEs and some potential approaches to solve the fundamental theoretical issues before SAEs can be used for interpreting LLMs robustly and at scale.
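The link between sparsity and absorption described above can be illustrated with a toy sketch. This is not the paper's code or metric: the two-dimensional feature directions, the two hand-built dictionaries, and every name below are assumptions chosen for illustration. It compares a dictionary with one latent per true feature against one whose child latent has absorbed the parent direction, on an input where both features are present.

```python
# Toy illustration of feature absorption (hypothetical setup, not the paper's code).
# Two "true" features form a hierarchy: the child only occurs together with the parent.
PARENT = (1.0, 0.0)   # e.g. "math"
CHILD = (0.0, 1.0)    # e.g. "algebra", which implies "math"


def add(u, v):
    """Element-wise sum of two vectors."""
    return tuple(a + b for a, b in zip(u, v))


def reconstruct(dictionary, codes):
    """Decode: sum of (latent activation * decoder direction)."""
    out = (0.0, 0.0)
    for direction, code in zip(dictionary, codes):
        out = add(out, tuple(code * d for d in direction))
    return out


def l0(codes):
    """Number of active latents, the quantity a sparsity penalty pushes down."""
    return sum(1 for c in codes if c != 0.0)


# An input on which both the parent and the child feature are present.
x_child = add(PARENT, CHILD)

# Dictionary A: one latent per true feature (no absorption).
dict_clean = [PARENT, CHILD]
codes_clean = [1.0, 1.0]        # parent latent AND child latent fire

# Dictionary B: the child latent's decoder has absorbed the parent direction.
dict_absorbed = [PARENT, add(PARENT, CHILD)]
codes_absorbed = [0.0, 1.0]     # parent latent stays silent on this input

for name, d, c in [("clean", dict_clean, codes_clean),
                   ("absorbed", dict_absorbed, codes_absorbed)]:
    print(name, "reconstruction:", reconstruct(d, c), "L0:", l0(c))
```

Both dictionaries reconstruct the input exactly, but the absorbed one does so with a single active latent. Under these toy assumptions a sparsity objective has no reason to prefer the clean solution, which is why the parent latent can fail to fire where it should.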

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders

Neural Information Processing Systems | Jun-12-2026, 20:41:28 GMT

artificial intelligence, large language model, natural language, (9 more...)

Neural Information Processing Systems

Genre: Research Report (0.60)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

The First Radio Signal From Comet 3I/Atlas Ends the Debate About Its Nature

WIRED | Nov-10-2025, 15:17:03 GMT

More evidence has emerged to support the natural origin of comet 3I/Atlas. After several weeks of conspiracy theories, social media debates, and speculation on popular podcasts such as Joe Rogan's, this interstellar object is still a comet. The most recent confirmation came from an observatory in South Africa that detected the first radio signal from 3I/Atlas.

artificial intelligence, radio signal, social media, (17 more...)

WIRED

Country:

Africa (0.55)
Asia (0.49)
North America > United States (0.48)

Industry:

Media (0.73)
Leisure & Entertainment (0.72)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.49)

Technology:

Information Technology > Artificial Intelligence (0.97)
Information Technology > Communications > Social Media (0.68)

Design and Structural Validation of a Micro-UAV with On-Board Dynamic Route Planning

Ravikumar, Inbazhagan, Sundhar, Ram, Vijayakumar, Narendhiran

arXiv.org Artificial Intelligence | Oct-27-2025

Micro aerial vehicles are becoming increasingly important in search and rescue operations due to their agility, speed, and ability to access confined spaces or hazardous areas. However, designing lightweight aerial systems presents significant structural, aerodynamic, and computational challenges. This work addresses two key limitations in many low-cost aerial systems under two kilograms: their lack of structural durability during flight through rough terrain and their inability to replan paths dynamically when new victims or obstacles are detected. We present a fully customised drone built from scratch using only commonly available components and materials, emphasising modularity, low cost, and ease of assembly. The structural frame is reinforced with lightweight yet durable materials to withstand impact, while the onboard control system is powered entirely by free, open-source software solutions. The proposed system demonstrates real-time perception and adaptive navigation capabilities without relying on expensive hardware accelerators, offering an affordable and practical solution for real-world search and rescue missions.

artificial intelligence, design and structural validation, path planning, (14 more...)

arXiv.org Artificial Intelligence

2510.21648

Country: North America > Canada (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology (0.71)
Aerospace & Defense (0.70)
Transportation (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.99)
Information Technology > Software (0.88)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.71)

Feature Hedging: Correlated Features Break Narrow Sparse Autoencoders

Chanin, David, Dulka, Tomáš, Garriga-Alonso, Adrià

arXiv.org Artificial Intelligence | Sep-29-2025

It is assumed that sparse autoencoders (SAEs) decompose polysemantic activations into interpretable linear directions, as long as the activations are composed of sparse linear combinations of underlying features. However, we find that if an SAE is narrower than the number of underlying "true features" on which it is trained, and there is correlation between features, the SAE will merge components of correlated features together, thus destroying monosemanticity. In LLM SAEs, these two conditions are almost certainly true. This phenomenon, which we call feature hedging, is caused by SAE reconstruction loss, and is more severe the narrower the SAE. In this work, we introduce the problem of feature hedging and study it both theoretically in toy models and empirically in SAEs trained on LLMs. We suspect that feature hedging may be one of the core reasons that SAEs consistently underperform supervised baselines. Finally, we use our understanding of feature hedging to propose an improved variant of matryoshka SAEs. Importantly, our work shows that SAE width is not a neutral hyperparameter: narrower SAEs suffer more from hedging than wider SAEs.

As large language models (LLMs) are deployed in real-world applications, it is increasingly important to understand their internal workings. SAEs have the advantage of operating completely unsupervised, and can easily be scaled to millions of neurons in their hidden layer (hereafter called "latents"). While SAEs showed promising results, recent work has cast doubt on the performance of SAEs relative to baseline techniques. Wu et al. (2025) show that SAEs underperform on both concept steering and detection relative to baselines, and Kantamneni et al. (2025) show that SAEs underperform simple linear probes on both in-domain and out-of-domain detection, even when the probes have very few training samples. The question, then, is why do SAEs underperform relative to other techniques? And if we can identify the problems holding back SAEs, can we then fix those problems?

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.11756

Country:

North America > United States (0.28)
Asia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

Karvonen, Adam, Rager, Can, Lin, Johnny, Tigges, Curt, Bloom, Joseph, Chanin, David, Lau, Yeu-Tong, Farrell, Eoin, McDougall, Callum, Ayonrinde, Kola, Till, Demian, Wearden, Matthew, Conmy, Arthur, Marks, Samuel, Nanda, Neel

arXiv.org Artificial Intelligence | Jun-5-2025

Sparse autoencoders (SAEs) are a popular technique for interpreting language model activations, and there is extensive recent work on improving SAE effectiveness. However, most prior work evaluates progress using unsupervised proxy metrics with unclear practical relevance. We introduce SAEBench, a comprehensive evaluation suite that measures SAE performance across eight diverse metrics, spanning interpretability, feature disentanglement and practical applications like unlearning. To enable systematic comparison, we open-source a suite of over 200 SAEs across eight recently proposed SAE architectures and training algorithms. Our evaluation reveals that gains on proxy metrics do not reliably translate to better practical performance. For instance, while Matryoshka SAEs slightly underperform on existing proxy metrics, they substantially outperform other architectures on feature disentanglement metrics; moreover, this advantage grows with SAE scale. By providing a standardized framework for measuring progress in SAE development, SAEBench enables researchers to study scaling trends and make nuanced comparisons between different SAE architectures and training methodologies. Our interactive interface enables researchers to flexibly visualize relationships between metrics across hundreds of open-source SAEs at: www.neuronpedia.org/sae-bench

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2503.09532

Country:

Asia > Japan (0.04)
North America > Canada (0.04)
Europe > France (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Fusion of Various Optimization Based Feature Smoothing Methods for Wearable and Non-invasive Blood Glucose Estimation

Wei, Yiting, Ling, Bingo Wing-Kuen, Chen, Danni, Dai, Yuheng, Liu, Qing

arXiv.org Artificial Intelligence | Mar-6-2025

Recently, a wearable and non-invasive blood glucose estimation approach has been proposed. However, due to the unreliability of the acquisition device, the presence of noise, and variations in the acquisition environment, the obtained features and the reference blood glucose values are highly unreliable. To address this issue, this paper proposes a polynomial fitting approach to smooth the obtained features or the reference blood glucose values. First, the blood glucose values are estimated based on the individual optimization approaches. Second, the absolute differences between the estimated and the actual blood glucose values are computed for each optimization approach. Third, these absolute differences are sorted in ascending order for each optimization approach. Fourth, for each sorted blood glucose value, the optimization method with the minimum absolute difference is selected. Fifth, the accumulated probability of each selected optimization method is computed. If the accumulated probability of any selected optimization method at a point is greater than a threshold value, then the accumulated probabilities of the three selected optimization methods at that point are reset to zero. A range of the sorted blood glucose values is defined by its two boundary points: the previous reset point and the current reset point. Hence, after performing the above procedure for all the sorted reference blood glucose values in the validation set, the regions of the sorted reference blood glucose values and the corresponding optimization methods in these regions are determined. The numerical simulation results show that the proposed method yields a mean absolute relative deviation (MARD) of 0.0930, with 94.1176% of the test data falling in zone A of the Clarke error grid.
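For readers unfamiliar with the reported metric, here is a minimal sketch of how a mean absolute relative deviation is commonly computed. It covers the evaluation metric only, not the paper's fusion procedure. The function name `mard` and the sample readings are made up for illustration.

```python
def mard(estimated, reference):
    """Mean absolute relative deviation: the mean of |est - ref| / ref."""
    if len(estimated) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) / r for e, r in zip(estimated, reference)) / len(reference)


# Made-up blood glucose readings (illustrative values only).
reference = [100.0, 150.0, 200.0]
estimated = [110.0, 135.0, 200.0]

print(round(mard(estimated, reference), 4))
```

A MARD of 0.0930, as reported in the abstract, would correspond under this definition to estimates that deviate from the reference by 9.3% on average.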

blood glucose value, ppg, reference blood glucose value, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1049/syb2.12063

2503.0377

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Transcoders Beat Sparse Autoencoders for Interpretability

Paulo, Gonçalo, Shabalin, Stepan, Belrose, Nora

arXiv.org Artificial Intelligence | Feb-12-2025

Sparse autoencoders (SAEs) extract human-interpretable features from deep neural networks by transforming their activations into a sparse, higher-dimensional latent space, and then reconstructing the activations from these latents. Transcoders are similar to SAEs, but they are trained to reconstruct the output of a component of a deep network given its input. In this work, we compare the features found by transcoders and SAEs trained on the same model and data, finding that transcoder features are significantly more interpretable. We also propose skip transcoders, which add an affine skip connection to the transcoder architecture, and show that these achieve lower reconstruction loss with no effect on interpretability.
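The difference between the three training targets can be sketched in a few lines. This is a simplified illustration rather than the authors' implementation: the ReLU encoder, the omitted biases and sparsity term, the tiny two-dimensional latent space, and all names and weights below are assumptions.

```python
# Minimal sketch (hypothetical weights and names) contrasting the training
# targets of an SAE, a transcoder, and a skip transcoder.

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def relu(v):
    return [max(0.0, a) for a in v]


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def encode(x, w_enc):
    """Latents as ReLU of a linear map (biases and sparsity term omitted)."""
    return relu(matvec(w_enc, x))


def sae_loss(x, w_enc, w_dec):
    # An SAE reconstructs the same activation it was given.
    return mse(matvec(w_dec, encode(x, w_enc)), x)


def transcoder_loss(x, y, w_enc, w_dec):
    # A transcoder predicts a component's OUTPUT y from its INPUT x.
    return mse(matvec(w_dec, encode(x, w_enc)), y)


def skip_transcoder_loss(x, y, w_enc, w_dec, w_skip, b_skip):
    # A skip transcoder adds an affine map of the input to the prediction.
    pred = matvec(w_dec, encode(x, w_enc))
    skip = [s + b for s, b in zip(matvec(w_skip, x), b_skip)]
    return mse([p + s for p, s in zip(pred, skip)], y)


identity = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 2.0]   # input to a hypothetical network component
y = [2.0, 4.0]   # output of that component (here simply 2 * x)

print(sae_loss(x, identity, identity))
print(transcoder_loss(x, y, identity, identity))
print(skip_transcoder_loss(x, y, identity, identity, identity, [0.0, 0.0]))
```

In this toy setting the skip path carries the linear part of the input-to-output map, so the sparse latents no longer have to account for it. That is one plausible reading of why an affine skip connection could lower reconstruction loss.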

artificial intelligence, machine learning, transcoder, (15 more...)

arXiv.org Artificial Intelligence

2501.18823

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Deep Neural Network for Phonon-Assisted Optical Spectra in Semiconductors

Gu, Qiangqiang, Pandey, Shishir Kumar

arXiv.org Artificial Intelligence | Feb-2-2025

Phonon-assisted optical absorption in semiconductors is crucial for understanding and optimizing optoelectronic devices, yet its accurate simulation remains a significant challenge in computational materials science. We present an efficient approach that combines deep learning tight-binding (TB) and potential models to calculate the phonon-assisted optical absorption in semiconductors with ab initio accuracy. Our strategy enables efficient sampling of atomic configurations through molecular dynamics and rapid computation of electronic structure and optical properties from the TB models. We demonstrate its efficacy by calculating the temperature-dependent optical absorption spectra and band gap renormalization of Si and GaAs due to electron-phonon coupling over a temperature range of 100-400 K. Our results show excellent agreement with experimental data, capturing both indirect and direct absorption processes, including subtle features like the Urbach tail. This approach offers a powerful tool for studying complex materials with high accuracy and efficiency, paving the way for high-throughput screening of optoelectronic materials.

artificial intelligence, calculation, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2502.00798

Country:

Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Insights into Lunar Mineralogy: An Unsupervised Approach for Clustering of the Moon Mineral Mapper (M3) spectral data

Thoresen, Freja, Drozdovskiy, Igor, Cowley, Aidan, Laban, Magdelena, Besse, Sebastien, Blunier, Sylvain

arXiv.org Artificial Intelligence | Nov-5-2024

This paper presents a novel method for mapping spectral features of the Moon using machine learning-based clustering of hyperspectral data from the Moon Mineral Mapper (M3) imaging spectrometer. The method uses a convolutional variational autoencoder to reduce the dimensionality of the spectral data and extract features of the spectra. Then, a k-means algorithm is applied to cluster the latent variables into five distinct groups, corresponding to dominant spectral features, which are related to the mineral composition of the Moon's surface. The resulting global spectral cluster map shows the distribution of the five clusters on the Moon, which consist of a mixture of, among others, plagioclase, pyroxene, olivine, and Fe-bearing minerals across the Moon's surface. The clusters are compared to the mineral maps from the Kaguya mission, which showed that the locations of the clusters overlap with the locations of high wt% of minerals such as plagioclase, clinopyroxene, and olivine. The paper demonstrates the usefulness of unbiased unsupervised learning for lunar mineral exploration and provides a comprehensive analysis of lunar mineralogy.

artificial intelligence, machine learning, spectra, (18 more...)

arXiv.org Artificial Intelligence

2411.03186

Country: Europe (1.00)

Genre: Research Report (1.00)

Industry:

Materials > Metals & Mining (0.87)
Energy > Oil & Gas > Upstream (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
