Supplementary Material: Organizing recurrent network dynamics by task-computation to enable continual learning
–Neural Information Processing Systems
Instead of projecting out previously encountered inputs only (as in OWM), our proposed learning rule modifies both sides of the gradient update. We compare modifications on either side of the gradient update to demonstrate that a double-sided modification reduces forgetting. An alternative interpretation of the action of the projection matrices is in terms of slowing down the learning rate along previously explored directions in network-activity space. Our learning algorithm hence implements an adaptive learning rate schedule dependent on the total variance of activity along input/output directions on previous tasks. We used 64 trials per minibatch during training.
Neural Information Processing Systems
Nov-14-2025, 22:33:28 GMT
- Country:
- Europe > United Kingdom
- England > Greater London > London (0.04)
- North America
- Canada (0.04)
- United States > California
- Santa Clara County
- Mountain View (0.04)
- Palo Alto (0.04)
- Stanford (0.04)
- Santa Clara County
- Europe > United Kingdom
- Technology: