Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Neural Information Processing Systems
Recently, several works have proposed applying variance reduction techniques developed in the stochastic optimization literature to reduce the variance of TD learning.
Aug-15-2025, 16:32:31 GMT
- Country:
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.14)
- North America
- Canada > Alberta (0.14)
- United States
- New York > Erie County
- Buffalo (0.04)
- Utah > Salt Lake County
- Salt Lake City (0.04)