AITopics | fast weight programmer

fast weight programmer

fc09b26b85ab3abb2832bd555a2e4215-Paper-Conference.pdf

Neural Information Processing Systems | Feb-13-2026, 01:57:00 GMT

differential equation, neural information processing system, neural network, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(16 more...)

Genre: Research Report (0.93)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Going Beyond Linear Transformers with Recurrent Fast Weight Programmers

Neural Information Processing Systems | Dec-24-2025, 00:48:53 GMT

Transformers with linearised attention ("linear Transformers") have demonstrated the practical scalability and effectiveness of outer product-based Fast Weight Programmers (FWPs) from the '90s. However, the original FWP formulation is more general than that of linear Transformers: a slow neural network (NN) continually reprograms the weights of a fast NN with arbitrary architecture. In existing linear Transformers, both NNs are feedforward and consist of a single layer. Here we explore new variations by adding recurrence to the slow and fast nets. We evaluate our novel recurrent FWPs (RFWPs) on two synthetic algorithmic tasks (code execution and sequential ListOps), Wikitext-103 language modelling, and the Atari 2600 2D game environment.
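
The outer-product mechanism the abstract refers to can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the slow net has already produced the key, value, and query vectors; it uses an ELU+1 feature map `phi` (an assumption taken from common linear-attention practice); it omits attention normalisation and the recurrent extensions; and the names `fwp_step` and `linear_attention` are hypothetical. It checks that additively writing outer products into a fast weight matrix and then reading with the query gives the same output as unnormalised causal linear attention.

```python
import math

def phi(x):
    # ELU(x) + 1 feature map (assumed here; keeps all features positive).
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fwp_step(W, k, v, q):
    """One purely additive fast-weight step (hypothetical helper).

    Write: W <- W + v (outer) phi(k).  Read: y = W phi(q).
    """
    fk, fq = phi(k), phi(q)
    W = [[w + vi * fkj for w, fkj in zip(row, fk)] for row, vi in zip(W, v)]
    y = [dot(row, fq) for row in W]
    return W, y

def linear_attention(keys, values, q):
    """Unnormalised causal linear attention over all pairs seen so far."""
    fq = phi(q)
    y = [0.0] * len(values[0])
    for k, v in zip(keys, values):
        score = dot(phi(k), fq)
        y = [yi + score * vi for yi, vi in zip(y, v)]
    return y

# Illustrative inputs (made up): 3 time steps, key/query size 2, value size 3.
keys = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
values = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0], [2.0, 1.0, -1.0]]
queries = [[0.1, 0.4], [-0.7, 1.0], [0.9, -0.2]]

W = [[0.0, 0.0] for _ in range(3)]  # fast weights start at zero
max_gap = 0.0
for t in range(len(keys)):
    W, y_fast = fwp_step(W, keys[t], values[t], queries[t])
    y_attn = linear_attention(keys[:t + 1], values[:t + 1], queries[t])
    max_gap = max(max_gap, max(abs(a - b) for a, b in zip(y_fast, y_attn)))
```

After the loop, `max_gap` stays at floating-point rounding level. The fast-weight form keeps a fixed-size matrix `W` instead of the growing list of keys and values that the attention form iterates over.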

fast weight programmer, linear transformer, transformer, (5 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games > Computer Games (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.45)

Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules

Neural Information Processing Systems | Aug-19-2025, 21:52:41 GMT

This yields continuous-time counterparts of Fast Weight Programmers and linear Transformers.

artificial intelligence, machine learning, neural information processing system, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(16 more...)

Genre: Research Report (0.93)

Industry:

Health & Medicine (0.69)
Education > Educational Setting > Continuing Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Going Beyond Linear Transformers with Recurrent Fast Weight Programmers

Neural Information Processing Systems | Oct-10-2024, 03:21:07 GMT

fast weight programmer, recurrent fast weight programmer, transformer, (1 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)

Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules

Irie, Kazuki; Faccio, Francesco; Schmidhuber, Jürgen

arXiv.org Artificial Intelligence | Oct-14-2022

Neural ordinary differential equations (ODEs) have attracted much attention as continuous-time counterparts of deep residual neural networks (NNs), and numerous extensions for recurrent NNs have been proposed. Since the 1980s, ODEs have also been used to derive theoretical results for NN learning rules, e.g., the famous connection between Oja's rule and principal component analysis. Such rules are typically expressed as additive iterative update processes which have straightforward ODE counterparts. Here we introduce a novel combination of learning rules and Neural ODEs to build continuous-time sequence processing nets that learn to manipulate short-term memory in rapidly changing synaptic connections of other nets. This yields continuous-time counterparts of Fast Weight Programmers and linear Transformers. Our novel models outperform the best existing Neural Controlled Differential Equation based models on various time series classification tasks, while also addressing their fundamental scalability limitations.
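
The link between an additive learning rule and its ODE counterpart can be illustrated with Oja's rule, which the abstract cites. The sketch below is a textbook illustration, not the paper's model: it integrates the averaged Oja flow dw/dt = C w - (w^T C w) w with explicit Euler steps, where each Euler step is exactly an additive iterative update. The covariance matrix `C`, the step size, and the helper names (`oja_flow`, `euler_integrate`) are assumptions chosen for illustration. The weight vector should converge to the unit-norm leading eigenvector of `C`, i.e. the first principal component.

```python
import math

def matvec(C, w):
    return [sum(c * x for c, x in zip(row, w)) for row in C]

def oja_flow(C, w):
    """Right-hand side of the averaged Oja ODE: C w - (w^T C w) w."""
    Cw = matvec(C, w)
    rayleigh = sum(a * b for a, b in zip(w, Cw))
    return [cw - rayleigh * wi for cw, wi in zip(Cw, w)]

def euler_integrate(C, w0, dt=0.01, steps=5000):
    """Explicit Euler: each step is the additive update w <- w + dt * f(w)."""
    w = list(w0)
    for _ in range(steps):
        dw = oja_flow(C, w)
        w = [wi + dt * di for wi, di in zip(w, dw)]
    return w

# Illustrative 2x2 covariance matrix (made up for this sketch).
C = [[3.0, 1.0], [1.0, 2.0]]
w = euler_integrate(C, [1.0, 0.0])

norm = math.sqrt(sum(x * x for x in w))
rayleigh_quotient = sum(a * b for a, b in zip(w, matvec(C, w)))
```

For this `C` the largest eigenvalue is (5 + sqrt(5)) / 2, so `rayleigh_quotient` should approach that value while `norm` approaches 1.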

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2206.01649

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(18 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (0.94)
Education > Educational Setting > Continuing Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Goal-Conditioned Generators of Deep Policies

Faccio, Francesco; Herrmann, Vincent; Ramesh, Aditya; Kirsch, Louis; Schmidhuber, Jürgen

arXiv.org Machine Learning | Jul-4-2022

Goal-conditioned Reinforcement Learning (RL) aims at learning optimal policies, given goals encoded in special command inputs. Here we study goal-conditioned neural nets (NNs) that learn to generate deep NN policies in the form of context-specific weight matrices, similar to Fast Weight Programmers and other methods from the 1990s. Using context commands of the form "generate a policy that achieves a desired expected return," our NN generators combine powerful exploration of parameter space with generalization across commands to iteratively find better and better policies. A form of weight-sharing HyperNetworks and policy embeddings scales our method to generate deep NNs. Experiments show how a single learned policy generator can produce policies that achieve any return seen during training. Finally, we evaluate our algorithm on a set of continuous control tasks where it exhibits competitive performance.
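
The idea of a generator that maps a return command to policy weights can be shown structurally with a toy sketch. Everything here is an assumption made for illustration and not the paper's architecture: the generator is a single affine map from a scalar command to a flat weight vector, the generated policy is a one-layer tanh network, the parameter values are made up, and no training loop, HyperNetwork weight sharing, or policy embedding is included. The names `generate_policy` and `act` are hypothetical.

```python
import math

def generate_policy(command, gen_weights, gen_bias, obs_dim, act_dim):
    """Hypothetical generator: scalar return command -> policy weight matrix.

    An affine map produces a flat vector, which is reshaped into an
    act_dim x obs_dim matrix (the "context-specific weight matrix").
    """
    flat = [g * command + b for g, b in zip(gen_weights, gen_bias)]
    return [flat[i * obs_dim:(i + 1) * obs_dim] for i in range(act_dim)]

def act(policy_W, obs):
    """Run the generated one-layer policy: action = tanh(W obs)."""
    return [math.tanh(sum(w * o for w, o in zip(row, obs))) for row in policy_W]

# Made-up generator parameters for a 3-dim observation, 2-dim action task.
gen_weights = [0.5, -0.2, 0.1, 0.3, 0.8, -0.6]
gen_bias = [0.0, 0.1, -0.1, 0.2, 0.0, 0.05]

W_low = generate_policy(0.0, gen_weights, gen_bias, obs_dim=3, act_dim=2)
W_high = generate_policy(1.0, gen_weights, gen_bias, obs_dim=3, act_dim=2)

obs = [0.2, -0.4, 1.0]
action_low = act(W_low, obs)
action_high = act(W_high, obs)
```

Different commands yield different weight matrices and therefore different actions for the same observation. In a trained system the generator parameters would be learned so that the commanded return matches the return the generated policy actually achieves.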

generator, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

2207.0157

Country:

Europe > Switzerland (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
