e354fd90b2d5c777bfec87a352a18976-AuthorFeedback.pdf
Neural Information Processing Systems
We thank all the reviewers for their encouraging comments. In both these cases, τ is effectively zero. Liu et al. show how GTD-class algorithms can be formally derived using a primal-dual saddle point [...] Sutton et al. present a (single time-scale) variant of linear TD learning, which they call emphatic TD, and show that [...] They also provide an asymptotic convergence analysis to the set of local optima. If the paper is accepted, we will work further on improving the clarity of the work.
Jun-1-2025, 21:07:08 GMT