Checklist

Feb-18-2024, 04:15:31 GMT–Neural Information Processing Systems

In the main text we present the TD and ETD algorithms for policy evaluation under linear function approximation, to acknowledge the existing literature on emphatic algorithms [27]. Here we present the derivation for policy evaluation under general function approximation. Following standard notation [41], capital letters for states, actions, or rewards denote the corresponding random variables at time t (i.e. S

interest function, objective, value function

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (0.69)
  - Neural Networks (0.47)