Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Neural Information Processing Systems
In reinforcement learning (RL), an agent interacts with a stochastic environment in order to maximize the total reward [Sutton and Barto, 2018].
Aug-14-2025, 11:44:55 GMT
- Country:
  - North America
    - Canada > Alberta (0.14)
    - United States
      - Massachusetts > Middlesex County
      - New York > Erie County
        - Buffalo (0.04)
      - Utah > Salt Lake County
        - Salt Lake City (0.04)
- Genre:
  - Research Report (0.46)
- Technology: