Reviews: Variance Reduced Policy Evaluation with Smooth Function Approximation

Jan-24-2025, 04:15:39 GMT–Neural Information Processing Systems

The main contribution of this paper is a method for solving the finite-sum minimax problem arising from off-line policy evaluation with nonlinear function approximation. The minimax problem is non-convex in the primal variable and strongly convex in the dual subproblem, and the authors propose a single time-scale algorithm to find an approximate stationary point. Although the paper does not address the full stochastic TD learning problem, the progress on the finite-sum off-line version is meaningful.
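To make the problem structure concrete, here is a minimal sketch of single time-scale gradient descent-ascent on a toy finite-sum minimax problem. Everything in it is an illustrative assumption rather than the paper's method: the objective, the names `single_timescale_gda`, `alpha`, and `beta`, and the data are all hypothetical. The sketch uses plain full gradients with no variance reduction, and the toy objective is convex in the primal variable, whereas the paper's setting is non-convex. The inner maximization over the dual variable `w` is concave here, which is the same as minimizing a strongly convex function.

```python
# Toy finite-sum minimax problem (hypothetical, not taken from the reviewed paper):
#   min_theta max_w (1/n) * sum_i [ w * (a_i * theta - b_i) - 0.5 * w**2 ]
# Maximizing over w is the same as minimizing a strongly convex function of w.
# Unlike the paper's setting, this toy objective is convex in theta.

def single_timescale_gda(a, b, alpha=0.05, beta=0.5, iters=500):
    """Run gradient descent on theta and gradient ascent on w simultaneously.

    Both step sizes are constants of the same order, which is what
    "single time-scale" refers to in this sketch.
    """
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    theta, w = 0.0, 0.0
    for _ in range(iters):
        # Full gradients of the averaged objective (no variance reduction).
        grad_theta = w * a_bar
        grad_w = (a_bar * theta - b_bar) - w
        # Simultaneous update: both variables use the previous iterate.
        theta, w = theta - alpha * grad_theta, w + beta * grad_w
    return theta, w


theta, w = single_timescale_gda([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

For this data the saddle point satisfies `a_bar * theta = b_bar` with `w = 0`, so the iterates should approach `theta = 2` and `w = 0`. A variance-reduced method would replace the full gradients with cheaper per-sample estimates corrected by stored or snapshot gradients. That step is omitted here.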

minimax problem, smooth function approximation, variance reduced policy evaluation


