Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis
Neural Information Processing Systems
Recently, several works have proposed applying the variance reduction techniques developed in the stochastic optimization literature to reduce the variance of TD learning.
Aug-15-2025, 16:32:31 GMT
- Country:
- North America > Canada > Alberta (0.14)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > United States > New York > Erie County > Buffalo (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Technology: