AITopics | dictionary element

The Sparse Manifold Transform

Each of these concepts has been individually explored in previous studies.

data mining, machine learning, manifold, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > Canada > Quebec > Montreal (0.04)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

Dan Braun, Jordan Taylor, Nicholas Goldowsky-Dill, Lee Sharkey

Neural Information Processing Systems, Nov-20-2025, 03:28:33 GMT

Identifying the features learned by neural networks is a core challenge in mechanistic interpretability. Sparse autoencoders (SAEs), which learn a sparse, overcomplete dictionary that reconstructs a network's internal activations, have been used to identify these features. However, SAEs may learn more about the structure of the dataset than the computational structure of the network. There is therefore only indirect reason to believe that the directions found in these dictionaries are functionally important to the network. We propose end-to-end (e2e) sparse dictionary learning, a method for training SAEs that ensures the features learned are functionally important by minimizing the KL divergence between the output distributions of the original model and the model with SAE activations inserted.

activation, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.40)
Oceania > Australia > Queensland (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

On the relations of LFPs & Neural Spike Trains

David E. Carlson, Jana Schaich Borg, Kafui Dzirasa, Lawrence Carin

Neural Information Processing Systems, Oct-11-2025, 00:02:40 GMT

One of the goals of neuroscience is to identify neural networks that correlate with important behaviors, environments, or genotypes. This work proposes a strategy for identifying neural networks characterized by time- and frequency-dependent connectivity patterns, using convolutional dictionary learning that links spike-train data to local field potentials (LFPs) across multiple areas of the brain. Analytical contributions are: (i) modeling dynamic relationships between LFPs and spikes; (ii) describing the relationships between spikes and LFPs, by analyzing the ability to predict LFP data from one region based on spiking information from across the brain; and (iii) development of a clustering methodology that allows inference of similarities in neurons from multiple regions. Results are based on data sets in which spike and LFP data are recorded simultaneously from up to 16 brain regions in a mouse.

brain region, lfp, neuron, (14 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

Dan Braun, Jordan Taylor, Nicholas Goldowsky-Dill, Lee Sharkey

Neural Information Processing Systems, Oct-10-2025, 15:40:38 GMT

Identifying the features learned by neural networks is a core challenge in mechanistic interpretability. Sparse autoencoders (SAEs), which learn a sparse, overcomplete dictionary that reconstructs a network's internal activations, have been used to identify these features. However, SAEs may learn more about the structure of the dataset than the computational structure of the network. There is therefore only indirect reason to believe that the directions found in these dictionaries are functionally important to the network. We propose end-to-end (e2e) sparse dictionary learning, a method for training SAEs that ensures the features learned are functionally important by minimizing the KL divergence between the output distributions of the original model and the model with SAE activations inserted.

activation, sae, sae local, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.40)
Oceania > Australia > Queensland (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

DB-KSVD: Scalable Alternating Optimization for Disentangling High-Dimensional Embedding Spaces

Valentin, Romeo, Katz, Sydney M., Vanhoucke, Vincent, Kochenderfer, Mykel J.

arXiv.org Artificial Intelligence, May-27-2025

Dictionary learning has recently emerged as a promising approach for mechanistic interpretability of large transformer models. Disentangling high-dimensional transformer embeddings, however, requires algorithms that scale to high-dimensional data with large sample sizes. Recent work has explored sparse autoencoders (SAEs) for this problem. However, SAEs use a simple linear encoder to solve the sparse encoding subproblem, which is known to be NP-hard. It is therefore interesting to understand whether this structure is sufficient to find good solutions to the dictionary learning problem or if a more sophisticated algorithm could find better solutions. In this work, we propose Double-Batch KSVD (DB-KSVD), a scalable dictionary learning algorithm that adapts the classic KSVD algorithm. DB-KSVD is informed by the rich theoretical foundations of KSVD but scales to datasets with millions of samples and thousands of dimensions. We demonstrate the efficacy of DB-KSVD by disentangling embeddings of the Gemma-2-2B model and evaluating on six metrics from the SAEBench benchmark, where we achieve competitive results when compared to established approaches based on SAEs. By matching SAE performance with an entirely different optimization approach, our results suggest that (i) SAEs do find strong solutions to the dictionary learning problem and (ii) that traditional optimization approaches can be scaled to the required problem sizes, offering a promising avenue for further research. We provide an implementation of DB-KSVD at https://github.com/RomeoV/KSVD.jl.
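To make the contrast drawn in this abstract concrete, the sketch below compares a one-shot linear encoder with a greedy pursuit on a toy dictionary. It is only an illustration under stated assumptions: the ReLU on the encoder, the tied weights, the function names, and the use of plain matching pursuit are all choices made here. None of this is the DB-KSVD algorithm or the KSVD.jl interface.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_relu_encode(x, atoms):
    # Assumed SAE-style encoder: a single linear map (tied to the atoms
    # for simplicity) followed by a ReLU. No iterative refinement.
    return [max(0.0, dot(x, atom)) for atom in atoms]


def matching_pursuit(x, atoms, k):
    # Greedy sparse coding: pick the atom most correlated with the current
    # residual, subtract its contribution, and repeat k times.
    # Atoms are assumed to have unit norm.
    residual = list(x)
    coeffs = [0.0] * len(atoms)
    for _ in range(k):
        scores = [dot(residual, atom) for atom in atoms]
        j = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[j] += scores[j]
        residual = [r - scores[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs, residual


s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]  # toy overcomplete dictionary
x = [3.0, 0.0]

dense_code = linear_relu_encode(x, atoms)
sparse_code, residual = matching_pursuit(x, atoms, k=1)
```

On this toy input the linear encoder activates two atoms, while a single pursuit step reconstructs x with one atom and leaves a zero residual.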

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.18441

Genre:

Research Report > New Finding (0.54)
Research Report > Promising Solution (0.34)

Industry: Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

The Kernel Beta Process

Lu Ren, Yingjian Wang, Lawrence Carin, David B. Dunson

Neural Information Processing Systems, Feb-11-2025, 18:23:42 GMT

A new Lévy process prior is proposed for an uncountable collection of covariate-dependent feature-learning measures; the model is called the kernel beta process (KBP). Available covariates are handled efficiently via the kernel construction, with covariates assumed observed with each data sample ("customer"), and latent covariates learned for each feature ("dish"). Each customer selects dishes from an infinite buffet, in a manner analogous to the beta process, with the added constraint that a customer first decides probabilistically whether to "consider" a dish, based on the distance in covariate space between the customer and dish. If a customer does consider a particular dish, that dish is then selected probabilistically as in the beta process. The beta process is recovered as a limiting case of the KBP. An efficient Gibbs sampler is developed for computations, and state-of-the-art results are presented for image processing and music analysis tasks.
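The two-stage "consider, then select" step described in this abstract can be sketched as follows. This is a hedged illustration, not the model from the paper: the Gaussian-shaped kernel, the `bandwidth` parameter, the finite list of dishes, and all function names are assumptions introduced here, and no Gibbs sampling is shown.

```python
import math
import random


def kernel(x, dish_location, bandwidth=1.0):
    # Assumed kernel: decays with squared distance in covariate space.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, dish_location))
    return math.exp(-sq_dist / bandwidth)


def selection_probability(x, dish_location, dish_weight, bandwidth=1.0):
    # Marginal probability of taking a dish: P(consider) * P(select | consider).
    return kernel(x, dish_location, bandwidth) * dish_weight


def sample_dishes(x, dishes, rng, bandwidth=1.0):
    # dishes: list of (location, weight) pairs; a finite stand-in for the
    # infinite buffet. Returns one 0/1 indicator per dish.
    chosen = []
    for location, weight in dishes:
        considers = rng.random() < kernel(x, location, bandwidth)
        selects = considers and rng.random() < weight
        chosen.append(1 if selects else 0)
    return chosen


dishes = [([0.0, 0.0], 0.3), ([10.0, 0.0], 0.9)]
customer = [0.0, 0.0]
picks = sample_dishes(customer, dishes, random.Random(0))
```

A dish located at the customer's own covariates is taken with its plain weight, which mirrors the abstract's remark that the beta process is recovered as a limiting case; a distant dish is almost never considered, however large its weight.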

artificial intelligence, beta process, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.87)

On the relations of LFPs & Neural Spike Trains

David E. Carlson, Jana Schaich Borg, Kafui Dzirasa, Lawrence Carin

Neural Information Processing Systems, Feb-9-2025, 17:50:09 GMT

One of the goals of neuroscience is to identify neural networks that correlate with important behaviors, environments, or genotypes. This work proposes a strategy for identifying neural networks characterized by time- and frequency-dependent connectivity patterns, using convolutional dictionary learning that links spike-train data to local field potentials (LFPs) across multiple areas of the brain. Analytical contributions are: (i) modeling dynamic relationships between LFPs and spikes; (ii) describing the relationships between spikes and LFPs, by analyzing the ability to predict LFP data from one region based on spiking information from across the brain; and (iii) development of a clustering methodology that allows inference of similarities in neurons from multiple regions. Results are based on data sets in which spike and LFP data are recorded simultaneously from up to 16 brain regions in a mouse.

artificial intelligence, machine learning, neuron, (16 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning

Braun, Dan, Taylor, Jordan, Goldowsky-Dill, Nicholas, Sharkey, Lee

arXiv.org Artificial Intelligence, May-24-2024

Identifying the features learned by neural networks is a core challenge in mechanistic interpretability. Sparse autoencoders (SAEs), which learn a sparse, overcomplete dictionary that reconstructs a network's internal activations, have been used to identify these features. However, SAEs may learn more about the structure of the dataset than the computational structure of the network. There is therefore only indirect reason to believe that the directions found in these dictionaries are functionally important to the network. We propose end-to-end (e2e) sparse dictionary learning, a method for training SAEs that ensures the features learned are functionally important by minimizing the KL divergence between the output distributions of the original model and the model with SAE activations inserted. Compared to standard SAEs, e2e SAEs offer a Pareto improvement: They explain more network performance, require fewer total features, and require fewer simultaneously active features per datapoint, all with no cost to interpretability. We explore geometric and qualitative differences between e2e SAE features and standard SAE features. E2e dictionary learning brings us closer to methods that can explain network behavior concisely and accurately.
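The training objective named in this abstract, a KL divergence between the output distributions of the original model and the model with SAE activations inserted, can be sketched for a single token position as below. This is a minimal sketch under assumptions: the direction of the KL term, the added L1 sparsity penalty, the `sparsity_coeff` value, and the function names are all illustrative choices made here and are not taken from the listing.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions over the same vocabulary.
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)


def e2e_loss(orig_logits, sae_logits, feature_acts, sparsity_coeff=0.1):
    # orig_logits: outputs of the unmodified model.
    # sae_logits: outputs when SAE reconstructions replace the activations.
    # feature_acts: SAE dictionary activations (assumed L1-penalized).
    kl = kl_divergence(softmax(orig_logits), softmax(sae_logits))
    l1 = sum(abs(a) for a in feature_acts)
    return kl + sparsity_coeff * l1


same = e2e_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [0.0, 0.0])
different = e2e_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.0, 0.0])
```

The loss depends only on how much inserting the SAE changes the model's output distribution, not on how closely the activations themselves are reconstructed, which is the distinction the abstract draws with standard SAEs.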

activation, sae, sae local, (15 more...)

arXiv.org Artificial Intelligence

2405.12241

Country:

Oceania > Australia > Queensland (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Industry: Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Not All Language Model Features Are Linear

Engels, Joshua, Liao, Isaac, Michaud, Eric J., Gurnee, Wes, Tegmark, Max

arXiv.org Artificial Intelligence, May-23-2024

Recent work has proposed the linear representation hypothesis: that language models perform computation by manipulating one-dimensional representations of concepts ("features") in activation space. In contrast, we explore whether some language model representations may be inherently multi-dimensional. We begin by developing a rigorous definition of irreducible multi-dimensional features based on whether they can be decomposed into either independent or non-co-occurring lower-dimensional features. Motivated by these definitions, we design a scalable method that uses sparse autoencoders to automatically find multi-dimensional features in GPT-2 and Mistral 7B. These auto-discovered features include strikingly interpretable examples, e.g. circular features representing days of the week and months of the year. We identify tasks where these exact circles are used to solve computational problems involving modular arithmetic in days of the week and months of the year. Finally, we provide evidence that these circular features are indeed the fundamental unit of computation in these tasks with intervention experiments on Mistral 7B and Llama 3 8B, and we find further circular representations by breaking down the hidden states for these tasks into interpretable components.
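As a toy picture of what a circular feature for days of the week could look like, the sketch below places each day on a unit circle and performs "day plus offset" as a rotation. It is a hand-built illustration only: the embedding, the function names, and the decoding rule are assumptions made here, and it does not reproduce the paper's SAE-based discovery method or its intervention experiments.

```python
import math

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def embed(day_index, period=7):
    # Map a day index to a point on the unit circle (a 2-D feature).
    angle = 2.0 * math.pi * day_index / period
    return (math.cos(angle), math.sin(angle))


def rotate(point, steps, period=7):
    # Adding `steps` days corresponds to rotating by steps/period of a turn.
    angle = 2.0 * math.pi * steps / period
    x, y = point
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))


def decode(point, period=7):
    # Read the day index back from the angle of the point.
    angle = math.atan2(point[1], point[0]) % (2.0 * math.pi)
    return round(angle * period / (2.0 * math.pi)) % period


four_after_monday = DAYS[decode(rotate(embed(0), 4))]
three_after_saturday = DAYS[decode(rotate(embed(5), 3))]
```

Because the representation is a circle rather than a line, wrap-around (Saturday plus three days) needs no special case; no single one-dimensional direction gives the same modular behavior.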

arxiv preprint arxiv, dictionary element, representation, (14 more...)

arXiv.org Artificial Intelligence

2405.14860

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
(2 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On the Analysis of Multi-Channel Neural Spike Data

Neural Information Processing Systems, Mar-14-2024, 21:48:45 GMT

Nonparametric Bayesian methods are developed for analysis of multi-channel spike-train data, with the feature learning and spike sorting performed jointly. The feature learning and sorting are performed simultaneously across all channels. Dictionary learning is implemented via the beta-Bernoulli process, with spike sorting performed via the dynamic hierarchical Dirichlet process (dHDP), with these two models coupled. The dHDP is augmented to eliminate refractory-period violations, it allows the "appearance" and "disappearance" of neurons over time, and it models smooth variation in the spike statistics.

dictionary element, neuron, spike, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)
