Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
–Neural Information Processing Systems
Reinforcement learning (RL) [1] studies the problem of learning in sequential decision-making problems where the dynamics of the environment are unknown but can be learnt by performing actions and observing their outcome in an online fashion. A sample-efficient RL agent must trade off the exploration needed to collect information about the environment against the exploitation of the experience gathered so far to gain as much reward as possible.
Feb-12-2026, 15:41:04 GMT
- Country:
- North America
- Canada > Quebec
- Montreal (0.04)
- United States
- Georgia > Fulton County
- Atlanta (0.04)
- Virginia > Arlington County
- Arlington (0.04)
- Technology: