Exploration in Structured Reinforcement Learning
Jungseul Ok, Alexandre Proutiere, Damianos Tranos
Neural Information Processing Systems
Hence, with large state and action spaces, it is essential to identify and exploit any structure in the system dynamics and reward function so as to minimize exploration phases and, in turn, reduce regret to reasonable values. Modern RL algorithms actually implicitly impose some structural properties either in the model parameters (transition probabilities and reward function, see e.g.
Feb-14-2026, 19:22:48 GMT