Representation-Driven Reinforcement Learning
Ofir Nabati, Guy Tennenholtz, Shie Mannor
arXiv.org Artificial Intelligence
We present a representation-driven framework for reinforcement learning. By representing policies as estimates of their expected values, we leverage techniques from contextual bandits to guide exploration and exploitation. Particularly, embedding a policy network into a linear feature space allows us to reframe the exploration-exploitation problem as a representation-exploitation problem, [...]

Salimans et al. (2017) have shown that such optimization methods may cause high variance updates in long horizon problems, while Tessler et al. (2019) have shown possible convergence to suboptimal solutions in continuous regimes. Moreover, policy search methods are commonly sample inefficient, particularly in hard exploration problems, as policy gradient methods usually converge to areas of high reward, without sacrificing exploration resources to achieve a far-reaching sparse reward.
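To make the abstract's idea concrete, here is a minimal sketch of how a linear bandit could choose among policies once each policy has a fixed-length embedding. It is not the paper's algorithm: it assumes a standard LinUCB-style rule, treats the embeddings as given vectors rather than learned representations, and every name in it (`LinearUCB`, `run_demo`, `alpha`, `reg`, the toy `true_w`) is hypothetical.

```python
import math
import random


class LinearUCB:
    """LinUCB-style selection over policy embeddings (illustrative only)."""

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.dim = dim
        self.alpha = alpha  # scale of the exploration bonus (assumed hyperparameter)
        # Inverse of the regularized design matrix; starts as (reg * I)^{-1}.
        self.a_inv = [[(1.0 / reg if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of observed_return * embedding

    def _matvec(self, x):
        return [sum(row[j] * x[j] for j in range(self.dim)) for row in self.a_inv]

    def weights(self):
        # Ridge-regression estimate of the linear value weights: w = A^{-1} b.
        return self._matvec(self.b)

    def ucb(self, x):
        # Estimated value of the policy embedded at x, plus an uncertainty bonus.
        w = self.weights()
        mean = sum(wi * xi for wi, xi in zip(w, x))
        a_inv_x = self._matvec(x)
        width = math.sqrt(max(0.0, sum(xi * vi for xi, vi in zip(x, a_inv_x))))
        return mean + self.alpha * width

    def select(self, embeddings):
        # Index of the candidate policy with the highest upper confidence bound.
        return max(range(len(embeddings)), key=lambda i: self.ucb(embeddings[i]))

    def update(self, x, observed_return):
        # Sherman-Morrison rank-one update of A^{-1} after adding x x^T to A.
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * vi for xi, vi in zip(x, a_inv_x))
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(self.dim):
            self.b[i] += observed_return * x[i]


def run_demo(rounds=200, seed=0):
    """Toy loop: returns are linear in the embedding plus Gaussian noise."""
    rng = random.Random(seed)
    true_w = [1.0, -0.5, 0.25]  # hidden value weights (toy assumption)
    candidates = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(10)]
    bandit = LinearUCB(dim=3, alpha=1.0)
    for _ in range(rounds):
        x = candidates[bandit.select(candidates)]
        ret = sum(w * xi for w, xi in zip(true_w, x)) + rng.gauss(0.0, 0.1)
        bandit.update(x, ret)
    return bandit, candidates, true_w


if __name__ == "__main__":
    bandit, _, true_w = run_demo()
    print("estimated weights:", bandit.weights())
    print("true weights:     ", true_w)
```

In this sketch the quality of exploration depends entirely on how well the embedding makes a policy's value linear, which is one way to read the "representation-exploitation" framing; the learned representation itself is outside what this toy covers.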
Jun-17-2023