Finding good policies in average-reward Markov Decision Processes without prior knowledge
–Neural Information Processing Systems
Reinforcement learning (RL) is a paradigm in which an agent interacts with its environment, modeled as a Markov Decision Process (MDP), by taking actions and observing rewards.
Neural Information Processing Systems
Oct-10-2025, 16:14:20 GMT
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > France
- Hauts-de-France > Nord > Lille (0.04)
- Asia > Middle East
- Genre:
- Research Report > Experimental Study (0.93)
- Technology: