Generative Memory



Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory

Li, Mingzhuo, Li, Guang, Mao, Jiafeng, Ogawa, Takahiro, Haseyama, Miki

arXiv.org Artificial Intelligence

Dataset distillation enables the training of deep neural networks with comparable performance in significantly reduced time by compressing large datasets into small, representative ones. Although the introduction of generative models has led to great achievements in this field, the distributions of their distilled datasets are not diverse enough to represent the original ones, leading to a decrease in downstream validation accuracy. In this paper, we present a diversity-driven generative dataset distillation method based on a diffusion model to solve this problem. We introduce a self-adaptive memory that assesses representativeness by aligning the distributions of the distilled and real datasets. The degree of alignment leads the diffusion model to generate more diverse datasets during the distillation process. Extensive experiments show that our method outperforms existing state-of-the-art methods in most situations, demonstrating its ability to tackle dataset distillation tasks.
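The abstract does not specify how the self-adaptive memory measures alignment. As a hedged illustration only, the sketch below uses a crude moment-matching score and greedy selection to grow a distilled set toward the real feature distribution; the names (`alignment_score`, `greedy_select`) and the scoring rule are assumptions, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)

def alignment_score(distilled, real):
    """Toy distribution-alignment score: gap between feature means plus gap
    between per-dimension variances (lower = better aligned). A stand-in
    for whatever criterion the self-adaptive memory actually uses."""
    mu_gap = np.linalg.norm(distilled.mean(0) - real.mean(0))
    var_gap = np.linalg.norm(distilled.var(0) - real.var(0))
    return mu_gap + var_gap

def greedy_select(candidates, real, budget):
    # Greedily grow the distilled set with whichever candidate most
    # improves alignment with the real feature distribution.
    chosen = []
    for _ in range(budget):
        best = min(
            (i for i in range(len(candidates)) if i not in chosen),
            key=lambda i: alignment_score(candidates[chosen + [i]], real),
        )
        chosen.append(best)
    return candidates[chosen]

real = rng.normal(size=(500, 8))        # stand-in "real" features
candidates = rng.normal(size=(40, 8))   # stand-in generated samples
distilled = greedy_select(candidates, real, budget=10)
print(distilled.shape)
```

In the paper's setting the score would instead steer the diffusion model's sampling, but the selection pressure toward matching the real distribution is the same idea.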


Reviews: Learning Attractor Dynamics for Generative Memory

Neural Information Processing Systems

This paper proposes a generative model which builds on ideas from dynamical systems and previous deep learning work like the Kanerva Machine. The main idea is to design and train an architecture that, when unrolled as a dynamical system, has points from the target distribution as attractors. I found the presentation of the model reasonably clear, but thought it suffered from excessive formality. E.g., the description of p(M) could just say that each row of M is an isotropic Gaussian with its own mean and scaled-identity covariance. The references to matrix-variate Gaussians, Kronecker products, vectorization operators, etc. don't contribute to clarity.
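The reviewer's suggested plain description can be written in one line; assuming R rows m_r with per-row means and variances (a sketch of the simplification, not notation taken from the paper):

```latex
p(M) \;=\; \prod_{r=1}^{R} \mathcal{N}\!\left(m_r \,\middle|\, \mu_r,\; \sigma_r^{2} I\right)
```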


DSI++: Updating Transformer Memory with New Documents

Mehta, Sanket Vaibhav, Gupta, Jai, Tay, Yi, Dehghani, Mostafa, Tran, Vinh Q., Rao, Jinfeng, Najork, Marc, Strubell, Emma, Metzler, Donald

arXiv.org Artificial Intelligence

Differentiable Search Indices (DSIs) encode a corpus of documents in model parameters and use the same model to answer user queries directly. Despite the strong performance of DSI models, deploying them in situations where the corpus changes over time is computationally expensive because reindexing the corpus requires re-training the model. In this work, we introduce DSI++, a continual learning challenge for DSI to incrementally index new documents while being able to answer queries related to both previously and newly indexed documents. Across different model scales and document identifier representations, we show that continual indexing of new documents leads to considerable forgetting of previously indexed documents. We also hypothesize and verify that the model experiences forgetting events during training, leading to unstable learning. To mitigate these issues, we investigate two approaches. The first focuses on modifying the training dynamics. Flatter minima implicitly alleviate forgetting, so we optimize for flatter loss basins and show that the model stably memorizes more documents ($+12\%$). Next, we introduce a generative memory to sample pseudo-queries for documents and supplement them during continual indexing to prevent forgetting for the retrieval task. Extensive experiments on novel continual indexing benchmarks based on Natural Questions (NQ) and MS MARCO demonstrate that our proposed solution mitigates forgetting significantly. Concretely, it improves the average Hits@10 by $+21.1\%$ over competitive baselines for NQ and requires $6$ times fewer model updates compared to re-training the DSI model for incrementally indexing five corpora in a sequence.
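The pseudo-query rehearsal idea — supplementing continual-indexing batches with generated queries for previously indexed documents — can be sketched as a data-mixing loop. The names (`make_batches`, `replay_ratio`) and the fixed mixing ratio are illustrative assumptions; the paper's actual training pipeline is not shown here:

```python
import random

random.seed(0)

def make_batches(new_examples, pseudo_examples, batch_size, replay_ratio=0.25):
    """Interleave (query, doc_id) pairs for newly indexed documents with
    generated pseudo-queries for old documents, so every batch rehearses
    the old corpus while indexing the new one."""
    n_replay = int(batch_size * replay_ratio)
    n_new = batch_size - n_replay
    random.shuffle(new_examples)
    batches = []
    for start in range(0, len(new_examples), n_new):
        batch = new_examples[start:start + n_new]
        # Sample pseudo-queries from the generative memory for old docs.
        batch += random.sample(pseudo_examples, k=min(n_replay, len(pseudo_examples)))
        random.shuffle(batch)
        batches.append(batch)
    return batches

new = [(f"query about new doc {i}", f"new-{i}") for i in range(12)]
old = [(f"pseudo-query for old doc {i}", f"old-{i}") for i in range(8)]
for b in make_batches(new, old, batch_size=4):
    print(len(b), sum(doc.startswith("old") for _, doc in b))
```

Each batch here carries three new-document examples and one rehearsal example, so the model never trains on the new corpus in isolation.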


Probabilistic Results on the Architecture of Mathematical Reasoning Aligned by Cognitive Alternation

Li, Minzheng, Fang, Xiangzhong, Yang, Haixin

arXiv.org Artificial Intelligence

AlphaGo made a ground-breaking advance on large search-space problems in the game of Go [1][2]. It was followed by ChatGPT earlier this year, which gained traction and popularity among both the public and scientists [3][4][5][6][7][8]. We are also gratified to witness artificial intelligence assisting humans in scientific research, as summarized in a recent Nature article [9]. It is time for us to undertake the task of building a machine capable of solving mathematical problems and exercises. Google Research has posted two versions of a preprint on machines for mathematical reasoning [10].


Learning Attractor Dynamics for Generative Memory

Wu, Yan, Wayne, Gregory, Gregor, Karol, Lillicrap, Timothy

Neural Information Processing Systems

A central challenge faced by memory systems is the robust retrieval of a stored pattern in the presence of interference from other stored patterns and noise. A theoretically well-founded solution to robust retrieval is given by attractor dynamics, which iteratively clean up patterns during recall. However, incorporating attractor dynamics into modern deep learning systems poses difficulties: attractor basins are characterised by vanishing gradients, which are known to make training neural networks difficult. In this work, we exploit recent advances in variational inference and avoid the vanishing gradient problem by training a generative distributed memory with a variational lower-bound-based Lyapunov function. The model is minimalistic, with surprisingly few parameters. Experiments show that it converges to correct patterns upon iterative retrieval and achieves competitive performance as both a memory model and a generative model.
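As a classical analogue of the attractor-based cleanup the abstract describes (not the paper's variational model), a Hopfield network stores patterns as fixed points of an energy function that asynchronous updates never increase — a discrete Lyapunov function playing the role the variational lower bound plays here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Store a few random binary (+/-1) patterns with the Hebbian outer-product rule.
n, k = 64, 3
patterns = rng.choice([-1.0, 1.0], size=(k, n))
W = (patterns.T @ patterns) / n
np.fill_diagonal(W, 0.0)  # no self-connections

def energy(s):
    # Lyapunov function: non-increasing under asynchronous sign updates.
    return -0.5 * s @ W @ s

def retrieve(probe, steps=5):
    s = probe.copy()
    for _ in range(steps):
        for i in rng.permutation(n):  # asynchronous updates
            s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s

# Corrupt a stored pattern with 15% bit flips, then clean it up iteratively.
noisy = patterns[0].copy()
flip = rng.choice(n, size=int(0.15 * n), replace=False)
noisy[flip] *= -1
recalled = retrieve(noisy)
print(float(np.mean(recalled == patterns[0])))  # fraction of bits recovered
```

The vanishing-gradient difficulty the abstract points to arises precisely because, near such fixed points, the update map is nearly constant with respect to its input.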


Generative Memory for Lifelong Reinforcement Learning

Raghavan, Aswin, Hostetler, Jesse, Chai, Sek

arXiv.org Artificial Intelligence

Our research is focused on understanding and applying biological memory transfer to new AI systems that can fundamentally improve their performance throughout their fielded lifetimes. We leverage the current understanding of biological memory transfer to arrive at AI algorithms for memory consolidation and replay. In this paper, we propose the use of a generative memory that can be recalled in batched samples to train a multi-task agent in a pseudo-rehearsal manner. We show results motivating the need for task-agnostic separation of the generative memory's latent space to address catastrophic forgetting in lifelong learning.
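The call for task-separated generative memory can be illustrated with a toy consolidation-and-replay loop. The per-task Gaussian generator below is a deliberately crude stand-in for a learned generative model, and all names (`GenerativeMemory`, `consolidate`, `replay`) are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

class GenerativeMemory:
    """Toy generative memory: one Gaussian over experience vectors per task.
    Keeping separate parameters per task is a stand-in for the latent-space
    separation the abstract argues for."""
    def __init__(self):
        self.tasks = {}

    def consolidate(self, task_id, experiences):
        # "Sleep phase": compress raw experience into generator parameters.
        x = np.asarray(experiences)
        self.tasks[task_id] = (x.mean(0), x.std(0) + 1e-6)

    def replay(self, batch_size):
        # Draw a pseudo-rehearsal batch spread evenly over consolidated
        # tasks, so no task is starved during multi-task training.
        per_task = max(1, batch_size // len(self.tasks))
        batch = [mu + sd * rng.normal(size=(per_task, mu.shape[0]))
                 for mu, sd in self.tasks.values()]
        return np.concatenate(batch)

mem = GenerativeMemory()
mem.consolidate("task-A", rng.normal(loc=0.0, size=(200, 4)))
mem.consolidate("task-B", rng.normal(loc=5.0, size=(200, 4)))
batch = mem.replay(batch_size=8)
print(batch.shape)
```

Because each task keeps its own generator parameters, consolidating task-B cannot overwrite what was stored for task-A — the toy version of avoiding catastrophic forgetting.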