Continual Unsupervised Representation Learning

Neural Information Processing Systems

Continual learning aims to improve the ability of modern learning systems to deal with non-stationary distributions, typically by attempting to learn a series of tasks sequentially. Prior art in the field has largely considered supervised or reinforcement learning tasks, and often assumes full knowledge of task labels and boundaries. In this work, we propose an approach (CURL) to tackle a more general problem that we will refer to as unsupervised continual learning. The focus is on learning representations without any knowledge about task identity, and we explore scenarios in which there are abrupt changes between tasks, smooth transitions from one task to another, or even shuffled data. The proposed approach performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting. We demonstrate the efficacy of CURL in an unsupervised learning setting with MNIST and Omniglot, where the lack of labels ensures no information is leaked about the task. Further, we demonstrate strong performance compared to prior art in an i.i.d. setting, or when adapting the technique to supervised tasks such as incremental class learning.






Kernel-Based Sparse Additive Nonlinear Model Structure Detection through a Linearization Approach

arXiv.org Machine Learning

The choice of parameterization in Nonlinear (NL) system models greatly affects the quality of the estimated model. Overly complex models can be impractical and hard to interpret, necessitating data-driven methods for simpler and more accurate representations. In this paper, we propose a data-driven approach to simplify a class of continuous-time NL system models using linear approximations around varying operating points. Specifically, for sparse additive NL models, our method identifies the number of NL subterms and their corresponding input spaces. Under small-signal operation, we approximate the unknown NL system as a trajectory-scheduled Linear Parameter-Varying (LPV) system, with LPV coefficients representing the gradient of the NL function and indicating input sensitivity. Using this sensitivity measure, we determine the NL system's structure through LPV model reduction by identifying non-zero LPV coefficients and selecting scheduling parameters. We introduce two sparse estimators within a vector-valued Reproducing Kernel Hilbert Space (RKHS) framework to estimate the LPV coefficients while preserving their structural relationships. The structure of the sparse additive NL model is then determined by detecting non-zero elements in the gradient vector (LPV coefficients) and the Hessian matrix (Jacobian of the LPV coefficients). We propose two computationally tractable RKHS-based estimators for this purpose. The sparsified Hessian matrix reveals the NL model's structure, with numerical simulations confirming the approach's effectiveness.
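The link between Hessian sparsity and additive structure can be illustrated with a minimal sketch. It is not the paper's RKHS estimator: it swaps in central finite differences on a known toy function, and the function `f`, the tolerance `tol`, the sampling range, and all helper names are assumptions made for illustration. Inputs whose mixed second derivatives are non-zero somewhere are grouped into the same subterm.

```python
import math
import random

def f(x):
    # Hypothetical sparse additive model: f(x) = g1(x0, x1) + g2(x2)
    return math.sin(x[0] * x[1]) + x[2] ** 3

def hessian_entry(func, x, i, j, h=1e-3):
    # Central finite-difference estimate of d^2 f / (dx_i dx_j) at x
    def shifted(di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return func(y)
    return (shifted(h, h) - shifted(h, -h)
            - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)

def hessian_support(func, dim, n_points=50, tol=1e-4, seed=0):
    # Mark entry (i, j) as non-zero if it exceeds tol at any sampled point
    rng = random.Random(seed)
    support = [[False] * dim for _ in range(dim)]
    for _ in range(n_points):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for i in range(dim):
            for j in range(dim):
                if abs(hessian_entry(func, x, i, j)) > tol:
                    support[i][j] = True
    return support

def additive_groups(support):
    # Connected components of the support pattern: each component is
    # the input set of one candidate nonlinear subterm
    dim = len(support)
    seen, groups = set(), []
    for start in range(dim):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(dim):
                if (support[i][j] or support[j][i]) and j not in seen:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups

groups = additive_groups(hessian_support(f, dim=3))
```

In this toy case the mixed derivatives between `x2` and the other inputs vanish, so `x0` and `x1` land in one group and `x2` in another. An input that enters only linearly has an all-zero Hessian row and would show up as its own singleton group, so the gradient (the LPV coefficients) would still be needed to tell it apart from an irrelevant input.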


rQdia: Regularizing Q-Value Distributions With Image Augmentation

arXiv.org Artificial Intelligence

With a simple auxiliary loss that equalizes Q-value distributions across image augmentations via MSE, rQdia boosts DrQ and SAC on 9/12 and 10/12 tasks, respectively, in the MuJoCo Continuous Control Suite from pixels, and Data-Efficient Rainbow on 18/26 Atari Arcade environments. Gains are measured in both sample efficiency and longer-term training. Human perception is invariant to and remarkably robust against many perturbations, such as discoloration, obfuscation, and low exposure. Artificial neural networks, on the other hand, do not intrinsically carry these invariance properties, though some invariances may be induced architecturally through inductive biases like convolution, kernel rotation, and dilation. In deep reinforcement learning (RL) from pixels, an agent must learn from raw pixels and therefore must learn to visually interpret a scene. Thus, recent approaches in deep RL have turned to the self-supervision and data augmentation techniques found in computer vision.
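A rough sketch of an MSE auxiliary loss of the kind described above follows. It is an illustration under stated assumptions, not the authors' implementation: the linear `q_values` function, the `darken` augmentation, and the `weight` coefficient are all hypothetical stand-ins for a real Q-network, a real image augmentation, and a tuned hyperparameter.

```python
def q_values(obs, action_weights):
    # Hypothetical linear Q-function: one weight vector per discrete action
    return [sum(w * p for w, p in zip(ws, obs)) for ws in action_weights]

def darken(obs, factor=0.5):
    # Hypothetical augmentation: scale pixel intensities (low exposure)
    return [factor * p for p in obs]

def mse(a, b):
    # Mean squared error between two equally sized lists of Q-values
    if len(a) != len(b):
        raise ValueError("Q-value lists must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def regularized_loss(rl_loss, obs, action_weights, weight=1.0):
    # Base RL loss plus a penalty on the gap between the Q-values of
    # the original observation and those of its augmented copy
    q_orig = q_values(obs, action_weights)
    q_aug = q_values(darken(obs), action_weights)
    return rl_loss + weight * mse(q_orig, q_aug)

obs = [0.2, 0.8]                          # toy two-pixel observation
action_weights = [[1.0, 0.0], [0.0, 1.0]]  # toy two-action Q-function
total = regularized_loss(0.5, obs, action_weights)
```

The penalty is zero only when the Q-function assigns the same values to an observation and its augmented copy, which is the invariance the auxiliary term is meant to encourage.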