A Appendix

A.1 Case study of a learned continual learner

We notice that when learning a new task, the lower layers of the structure (e.g., the first layer, i.e., the lowest layer) tend to use the "mask" action to control their output values, because each task has its own specific characteristics. The higher layers, in contrast, are able to combine low-dimensional features, so more "fuse" operations appear there, combining the abilities learned on previous tasks. "With/without mask" means the "mask" action is (or is not) used. The results are summarized in Table 5. BNS outperforms the baseline models overall in terms of forward transfer, which we attribute to its use of reinforcement learning. Each number is the sum of all task model parameters in the final network after all tasks have been learned.

Table 9: Training time (minutes) used by our BNS model and all baselines in each experiment.
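To make the two actions concrete, here is a minimal sketch of what "mask" (gating a layer's output) and "fuse" (combining outputs from modules learned on previous tasks) could look like. The function names, the element-wise gate, and the weighted-sum fusion are illustrative assumptions, not the BNS implementation itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask(output, gate):
    # "mask" action (illustrative): element-wise gate that suppresses
    # parts of a layer's output so a new task reuses it selectively.
    return output * gate

def fuse(outputs, weights):
    # "fuse" action (illustrative): normalized weighted combination of
    # outputs produced by modules trained on previous tasks.
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * o for w, o in zip(weights, outputs))

# Toy layer outputs from two previously learned task modules.
out_task1 = rng.standard_normal(4)
out_task2 = rng.standard_normal(4)

# A lower layer masks its output; a higher layer fuses both modules.
gated = mask(out_task1, gate=np.array([1.0, 0.0, 1.0, 0.0]))
combined = fuse([gated, out_task2], weights=[0.5, 0.5])
print(combined.shape)  # (4,)
```

In this toy picture, lower layers would mostly emit gates (task-specific filtering), while higher layers would mostly emit fusion weights (recombining earlier abilities), matching the tendency described above.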
Disentangling Transfer in Continual Reinforcement Learning

Maciej Wołczyk, Faculty of Mathematics and Computer Science
We adopt SAC as the underlying RL algorithm and Continual World as a suite of continuous control tasks. We systematically study how different components of SAC (the actor and the critic, exploration, and data) affect transfer efficacy, and we provide recommendations regarding various modeling options.
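One of the modeling options studied is which SAC components carry over between consecutive tasks. A minimal sketch of that choice, with hypothetical parameter containers (the dict layout and the zeros-as-fresh-init stand-in are assumptions for illustration):

```python
import numpy as np

# Hypothetical parameter sets for SAC's two networks after task 1.
prev_params = {
    "actor": np.ones((4, 2)),
    "critic": np.ones((4, 1)),
}

def init_for_new_task(prev, transfer=("critic",)):
    """Start the next task from the previous task's weights for the
    chosen components, re-initializing the rest (zeros stand in for
    a fresh random init here)."""
    return {
        name: prev[name].copy() if name in transfer
        else np.zeros_like(prev[name])
        for name in prev
    }

params = init_for_new_task(prev_params, transfer=("critic",))
print(params["critic"].sum())  # 4.0 -- carried over from task 1
print(params["actor"].sum())   # 0.0 -- freshly initialized
```

Varying the `transfer` tuple (actor only, critic only, both, neither) is one way to probe how each component contributes to transfer, in the spirit of the study described above.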