Meta-Reinforcement Learning with Self-Modifying Networks

Neural Information Processing Systems

Deep Reinforcement Learning has demonstrated the potential of neural networks tuned with gradient descent for solving complex tasks in well-delimited environments. However, these neural systems are slow learners, producing specialized agents with no mechanism to continue learning beyond their training curriculum. In contrast, biological synaptic plasticity is persistent and manifold, and has been hypothesized to play a key role in executive functions such as working memory and cognitive flexibility, potentially supporting more efficient and generic learning abilities. Inspired by this, rather than relying on a fixed network configuration, we propose to build networks with dynamic weights, able to continually perform self-reflexive modification as a function of their current synaptic state and action-reward feedback. The resulting model, MetODS (for Meta-Optimized Dynamical Synapses), is a broadly applicable meta-reinforcement learning system able to learn efficient and powerful control rules in the agent policy space. A single layer with dynamic synapses can perform one-shot learning, generalize navigation principles to unseen environments, and manifest a strong ability to learn adaptive motor policies.
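
To make the idea concrete, here is a minimal, self-contained Python sketch of a layer whose weight matrix is itself a state variable, rewritten at each step from the current synaptic state and reward feedback. The reward-gated Hebbian update and the coefficients alpha and eta are assumptions for illustration only; MetODS meta-optimizes its own synaptic update rule rather than fixing one by hand.

    import numpy as np

    class DynamicSynapseLayer:
        # Sketch of a layer with self-modifying weights: W is part of the
        # agent's state and is rewritten at every interaction step.
        # The update form and alpha/eta are assumptions, not the paper's rule.

        def __init__(self, n_in, n_out, alpha=0.9, eta=0.1, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
            self.alpha = alpha  # retention of the current synaptic state
            self.eta = eta      # step size of the fast, within-episode update

        def forward(self, x):
            self.x = x                    # pre-synaptic activity
            self.y = np.tanh(x @ self.W)  # post-synaptic activity
            return self.y

        def update(self, reward):
            # Self-reflexive update: the new weights are a function of the
            # current synaptic state and of action-reward feedback.
            self.W = self.alpha * self.W + self.eta * reward * np.outer(self.x, self.y)

    # One interaction step: act, observe reward, self-modify.
    layer = DynamicSynapseLayer(n_in=8, n_out=4)
    action_scores = layer.forward(np.random.default_rng(1).normal(size=8))
    layer.update(reward=1.0)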


DOBF: A Deobfuscation Pre-Training Objective for Programming Languages

Neural Information Processing Systems

Recent advances in self-supervised learning have dramatically improved the state of the art on a wide variety of tasks. However, research in language model pre-training has mostly focused on natural languages, and it is unclear whether models like BERT and its variants provide the best pre-training when applied to other modalities, such as source code. In this paper, we introduce a new pre-training objective, DOBF, that leverages the structural aspect of programming languages and pre-trains a model to recover the original version of obfuscated source code. We show that models pre-trained with DOBF significantly outperform existing approaches on multiple downstream tasks, providing relative improvements of up to 12.2% in unsupervised code translation, and 5.3% in natural language code search. Incidentally, we found that our pre-trained model is able to deobfuscate fully obfuscated source files, and to suggest descriptive variable names.
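
The objective is easy to picture as a data-construction step: identifiers are replaced with uninformative placeholders, and the model must recover the original names from the obfuscated code. Below is a toy Python sketch of building such a training pair; the regex-based replacement and the V0, V1 naming are simplifications of my own (DOBF tokenizes code properly and obfuscates class, function, and variable names consistently).

    import re

    def obfuscate(code, names):
        """Replace each identifier in `names` with a placeholder Vi and
        return (obfuscated code, recovery dictionary) as a training pair.
        Toy sketch only: a real pipeline uses a code tokenizer, not regexes."""
        mapping = {}
        for i, name in enumerate(names):
            placeholder = f"V{i}"
            mapping[placeholder] = name
            code = re.sub(rf"\b{re.escape(name)}\b", placeholder, code)
        return code, mapping

    src = "def factorial(n):\n    return 1 if n <= 1 else n * factorial(n - 1)"
    obf, target = obfuscate(src, ["factorial", "n"])
    # obf:    "def V0(V1):\n    return 1 if V1 <= 1 else V1 * V0(V1 - 1)"
    # target: {"V0": "factorial", "V1": "n"} -- the model learns to predict this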


Speeding up design and making to reduce time-to-project and time-to-market: an AI-Enhanced approach in engineering education

arXiv.org Artificial Intelligence

This paper explores the integration of AI tools, such as ChatGPT and GitHub Copilot, into the Software Architecture for Embedded Systems course. AI-supported workflows enabled students to rapidly prototype complex projects, emphasizing real-world applications such as SLAM robotics. Results demonstrated enhanced problem-solving, faster development, and more sophisticated outcomes, with AI augmenting, but not replacing, human decision-making.


Online Meta-Learning via Learning with Layer-Distributed Memory

Neural Information Processing Systems

We demonstrate that efficient meta-learning can be achieved via end-to-end training of deep neural networks with memory distributed across layers. The persistent state of this memory assumes the entire burden of guiding task adaptation. Moreover, its distributed nature is instrumental in orchestrating adaptation. Ablation experiments demonstrate that providing relevant feedback to memory units distributed across the depth of the network enables them to guide adaptation throughout the entire network. Our results show that this is a successful strategy for simplifying meta-learning - often cast as a bi-level optimization problem - to standard end-to-end training, while outperforming gradient-based, prototype-based, and other memory-based meta-learning strategies. Additionally, our adaptation strategy naturally handles online learning scenarios with a significant delay between observing a sample and its corresponding label - a setting in which other approaches struggle. Adaptation via distributed memory is effective across a wide range of learning tasks, from classification to online few-shot semantic segmentation.
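
One way to picture the mechanism: each layer carries a persistent recurrent state, and the (possibly delayed) previous label is fed back as input, so adaptation amounts to writing into memory rather than taking gradient steps at test time. A minimal Python sketch under those assumptions; the LSTM cells, sizes, and feedback wiring are illustrative, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class MemoryLayer(nn.Module):
        """A layer whose persistent LSTM state acts as distributed memory.
        Adaptation = updating (h, c) from inputs plus label feedback,
        with no gradient steps at test time. Reset state at task boundaries."""

        def __init__(self, dim_in, dim_out):
            super().__init__()
            self.cell = nn.LSTMCell(dim_in, dim_out)
            self.state = None  # persistent (h, c), carried across the task

        def forward(self, x):
            self.state = self.cell(x, self.state)
            return self.state[0]

    class MemoryNet(nn.Module):
        """Stack of memory layers; the previous label (one-hot, possibly
        delayed) is concatenated to the input so the feedback reaches the
        memory units at every depth."""

        def __init__(self, dim_x, n_classes, hidden=64, depth=3):
            super().__init__()
            dims = [dim_x + n_classes] + [hidden] * depth
            self.layers = nn.ModuleList(
                MemoryLayer(dims[i], dims[i + 1]) for i in range(depth))
            self.head = nn.Linear(hidden, n_classes)

        def forward(self, x, prev_label_onehot):
            h = torch.cat([x, prev_label_onehot], dim=-1)
            for layer in self.layers:
                h = layer(h)
            return self.head(h)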


Supplementary Materials for: Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State

Neural Information Processing Systems

Figure 1 illustrates our feedback models with single-layer and multi-layer structures, as described in Sections 4.1 and 4.3. We present the pseudocode of one iteration of IDE training in Algorithm 1 to better illustrate our training method.

Algorithm 1: one iteration of IDE training.
Input: network parameters θ; input data x; label y; time steps T; other hyperparameters.
Output: trained network parameters θ.
1. Simulate the SNN for T time steps with input x based on Eq. (2) and calculate the final (weighted) average firing rate a[T].
2. Calculate the output o and the loss L based on o and y.
3. Update θ based on the gradient-based optimizer.
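
To make the equilibrium-state idea concrete, here is a toy, self-contained Python sketch in the DEQ style: the fixed point is found without tracking gradients, and a single re-attached step lets autograd produce a one-step approximation of the implicit gradient. This is an illustration only; the paper derives the exact implicit differentiation for feedback SNN firing rates via Eq. (2), and the tanh map, sizes, and target below are assumptions.

    import torch

    torch.manual_seed(0)
    W = (0.1 * torch.randn(16, 16)).requires_grad_()
    U = (0.1 * torch.randn(16, 8)).requires_grad_()
    x = torch.randn(8)

    # Find the equilibrium a* = tanh(W a* + U x) by fixed-point iteration,
    # without storing the unrolled computation graph.
    with torch.no_grad():
        a = torch.zeros(16)
        for _ in range(50):  # plays the role of the T simulation steps
            a = torch.tanh(W @ a + U @ x)

    # Re-attach one step at the fixed point: autograd then gives a one-step
    # approximation of the implicit gradient (the paper computes it exactly).
    a = torch.tanh(W @ a + U @ x)
    loss = (a - 1.0).pow(2).mean()
    loss.backward()  # gradients for W and U without backprop through 50 steps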



Minimax Optimal Online Imitation Learning via Replay Estimation

Neural Information Processing Systems

Online imitation learning is the problem of how best to mimic expert demonstrations, given access to the environment or an accurate simulator. Prior work has shown that in the infinite sample regime, exact moment matching achieves value equivalence to the expert policy.
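
For context, the standard argument behind this claim can be sketched in a few lines (the notation here is assumed, not taken from the paper). Writing $J(\pi)$ for the expected return of $\pi$ and $\rho_\pi$ for its state-action occupancy measure:

    J(\pi) \;=\; \mathbb{E}_{(s,a)\sim\rho_\pi}\big[r(s,a)\big]
    \qquad\Longrightarrow\qquad
    J(\pi_E) - J(\pi) \;=\; \mathbb{E}_{\rho_{\pi_E}}[r] \;-\; \mathbb{E}_{\rho_\pi}[r].

Hence, if the learner matches every moment of the expert, $\mathbb{E}_{\rho_\pi}[f] = \mathbb{E}_{\rho_{\pi_E}}[f]$ for all $f$ in a class $\mathcal{F}$ containing the true reward $r$, the value gap is exactly zero. With finitely many demonstrations the moments can only be estimated, which is the finite-sample regime the paper addresses.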


Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling

Neural Information Processing Systems

We show that the sum of a GAN generator's implicit log-density and its discriminator's logit score defines an energy-based model that corrects the generator's bias. Samples can be generated from this modified density by sampling in latent space according to an energy-based model induced by the sum of the latent prior log-density and the discriminator output score. We call this process of running Markov Chain Monte Carlo in the latent space, and then applying the generator function, Discriminator Driven Latent Sampling (DDLS). We show that DDLS is highly efficient compared to previous methods which work in the high-dimensional pixel space, and can be applied to improve on previously trained GANs of many types. We evaluate DDLS on both synthetic and real-world datasets, qualitatively and quantitatively. On CIFAR-10, DDLS substantially improves the Inception Score of an off-the-shelf pre-trained SN-GAN [1] from 8.22 to 9.09, which is comparable to the class-conditional BigGAN [2] model. This achieves a new state-of-the-art in the unconditional image synthesis setting without introducing extra parameters or additional training.
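
A minimal Python sketch of the latent-space MCMC, assuming a standard-normal latent prior (so -log p(z) = ||z||^2/2 up to a constant) and a discriminator D that returns the pre-sigmoid logit; the function name, step count, and step size eps are illustrative placeholders, not the paper's settings.

    import torch

    def ddls_sample(G, D, z0, steps=100, eps=0.01):
        """Sketch of Discriminator Driven Latent Sampling: Langevin MCMC on
        E(z) = -log p(z) - d(G(z)), then one pass through the generator."""
        z = z0.clone().requires_grad_(True)
        for _ in range(steps):
            # Energy of the induced latent EBM (up to an additive constant).
            energy = 0.5 * (z ** 2).sum(dim=1) - D(G(z)).squeeze(-1)
            (grad,) = torch.autograd.grad(energy.sum(), z)
            with torch.no_grad():
                # Unadjusted Langevin step: drift down the energy plus noise.
                z = z - 0.5 * eps * grad + (eps ** 0.5) * torch.randn_like(z)
            z.requires_grad_(True)
        return G(z).detach()

    # Usage with a pre-trained generator/discriminator pair:
    # samples = ddls_sample(G, D, torch.randn(64, latent_dim))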