
Organizing recurrent network dynamics by task-computation to enable continual learning

Neural Information Processing Systems

Biological systems face dynamic environments that require continual learning. It is not well understood how these systems balance the tension between flexibility for learning and robustness for memory of previous behaviors. Continual learning without catastrophic interference also remains a challenging problem in machine learning. Here, we develop a novel learning rule designed to minimize interference between sequentially learned tasks in recurrent networks. Our learning rule preserves network dynamics within activity-defined subspaces used for previously learned tasks. It encourages dynamics associated with new tasks that might otherwise interfere to instead explore orthogonal subspaces, and it allows for reuse of previously established dynamical motifs where possible.
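The abstract does not spell out the rule itself; as a rough illustration of the general idea, here is a minimal NumPy sketch of protecting a previously used activity subspace by projecting new weight updates onto its orthogonal complement. The two-sided projection follows the authors' later remark that their rule modifies both sides of the gradient update; the data, dimensions, and subspace size here are all made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50  # number of recurrent units (hypothetical)

# Activity collected while performing a previously learned task
# (random data standing in for recorded network states).
prev_activity = rng.standard_normal((200, n))

# Span the top principal directions of that activity; updates for
# new tasks will be projected away from this subspace.
_, _, vt = np.linalg.svd(prev_activity, full_matrices=False)
k = 10                           # dimensions reserved for the old task
basis = vt[:k].T                 # (n, k) orthonormal basis
P = np.eye(n) - basis @ basis.T  # projector onto the orthogonal complement

# A raw gradient update for a new task...
raw_update = rng.standard_normal((n, n))

# ...constrained so it neither reads from nor writes into the old
# task's subspace (projection applied on both sides).
safe_update = P @ raw_update @ P

# The constrained update has numerically no component inside the
# protected subspace, so old-task dynamics there are untouched.
print(np.abs(basis.T @ safe_update @ basis).max())
```

Dynamics for a new task are thereby steered into the remaining orthogonal directions, which is the behavior the abstract describes.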


Review for NeurIPS paper: Organizing recurrent network dynamics by task-computation to enable continual learning

Neural Information Processing Systems

Summary and Contributions: This manuscript addresses the problem of continual learning in RNNs. The authors propose a new learning rule that organizes the dynamics for different tasks into orthogonal subspaces. Using a set of neuroscience tasks, they show how this learning rule avoids catastrophic interference between tasks. By analyzing the dynamics of trained networks, they provide evidence for why their learning rule is successful; this analysis also allows them to discuss the problem of transfer learning. Strengths: - proposes a new, original solution to the problem of continual learning, which also allows the authors to address and understand the conditions under which learning in one task can be transferred to learning of another task.


Review for NeurIPS paper: Organizing recurrent network dynamics by task-computation to enable continual learning

Neural Information Processing Systems

The reviewers generally agree that this paper offers a novel viewpoint on avoiding catastrophic forgetting. The theoretical and experimental results are well received. R3 would have preferred a deeper discussion of the differences with OWM. However, the authors explained during the rebuttal that, unlike OWM, their learning rule modifies both sides of the gradient update. This characteristic, together with the intricacies involved in its sequential application, makes the overall contribution significant enough.



Fool's Gold: Extracting Finite State Machines from Recurrent Network Dynamics

Kolen, John F.

Neural Information Processing Systems

Several recurrent networks have been proposed as representations for the task of formal language learning. After training a recurrent network to recognize a formal language or predict the next symbol of a sequence, the next logical step is to understand the information processing carried out by the network. Some researchers have begun extracting finite state machines from the internal state trajectories of their recurrent networks. This paper describes how sensitivity to initial conditions and discrete measurements can trick these extraction methods into returning illusory finite state descriptions.
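The failure mode described here can be illustrated with any system that is sensitive to initial conditions. The sketch below is a hypothetical stand-in using the logistic map rather than the paper's trained networks: it discretizes two trajectories started from nearly identical points into a few symbolic states. The symbol sequences agree at first and then diverge, so any finite-state description read off from them depends on measurement precision rather than on genuine finite-state structure.

```python
def trajectory(x0, steps=100, r=3.9):
    # Logistic map: a simple system with sensitive dependence on
    # initial conditions, standing in for a recurrent network's
    # continuous state dynamics.
    xs = []
    x = x0
    for _ in range(steps):
        x = r * x * (1 - x)
        xs.append(x)
    return xs


def discretize(xs, bins=4):
    # "Extract" symbolic states by binning the continuous state,
    # mimicking the discrete measurements used in FSM extraction.
    return [min(int(x * bins), bins - 1) for x in xs]


# Two trajectories from nearly identical initial conditions.
states_a = discretize(trajectory(0.400000))
states_b = discretize(trajectory(0.400001))

# The symbol sequences start out identical...
print(states_a[:5], states_b[:5])
# ...but diverge once the tiny initial difference is amplified, so
# the finite-state machine read off each trajectory would differ.
print(states_a == states_b)
```

The same effect applies to a trained recurrent network whose state dynamics are chaotic: the extracted machine reflects the discretization, not the computation.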

