imagination
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Variational Temporal Abstraction
Taesup Kim, Sungjin Ahn, Yoshua Bengio
There have been approaches to learning such hierarchical structure in sequences, such as the HMRNN (Chung et al., 2016). However, as a deterministic model, it has the main limitation that it cannot capture the stochastic nature prevailing in the data. In particular, this is a critical limitation for imagination-augmented agents, because exploring various possible futures according to the uncertainty is what makes the imagination meaningful in many cases.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
Offline Reinforcement Learning with Reverse Model-based Imagination
However, in many real-world applications, collecting sufficient exploratory interactions is usually impractical, because online data collection can be costly or even dangerous, such as in healthcare [4] and autonomous driving [5]. To address this challenge, offline RL [6, 7] develops a new learning paradigm that trains RL agents only with pre-collected offline datasets and thus can abstract away from the cost of online exploration [8-17].
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Pennsylvania (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)