Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation
Neural Information Processing Systems
In reinforcement learning (RL), an agent interacts with a stochastic environment in order to maximize the total reward [Sutton and Barto, 2018].
Nov-14-2025, 03:22:40 GMT