Supplementary Material for Rethinking Value Function Learning for Generalization in Reinforcement Learning

A Stiffness Analysis

Aug-19-2025, 13:06:37 GMT–Neural Information Processing Systems

The green lines in Figure 1 show that stiffness decreases as the number of training levels increases in most Procgen games. This suggests that the delayed critic update effectively alleviates the memorization problem.

Each agent is trained on 200 training levels for 25M environment steps. Each agent is trained for 8M environment steps. The mean is computed over 10 different runs.
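The excerpt above does not define stiffness. One common formulation measures it as the cosine similarity between the loss gradients computed on two different inputs, averaged over input pairs. The sketch below assumes that formulation, together with a linear value function and a squared-error loss chosen purely for illustration. All function names, features, and targets are hypothetical and are not taken from the paper or its code.

```python
import math
from itertools import combinations


def value_loss_gradient(weights, features, target):
    """Gradient of 0.5 * (w . x - target)^2 with respect to w (assumed loss)."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    return [error * x for x in features]


def cosine_similarity(u, v):
    """Cosine of the angle between two gradient vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_stiffness(weights, states, targets):
    """Average pairwise cosine similarity of per-state loss gradients."""
    grads = [value_loss_gradient(weights, s, t) for s, t in zip(states, targets)]
    pairs = list(combinations(grads, 2))
    return sum(cosine_similarity(g, h) for g, h in pairs) / len(pairs)


# Toy data (hypothetical): three states with 2-D features and made-up targets.
weights = [0.5, -0.5]
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [1.0, 1.0, 1.0]
stiffness = mean_stiffness(weights, states, targets)
```

Under this definition, a value near 1 means that a gradient step on one state moves the value estimate on other states in a similar direction, while a value near 0 means the updates are largely unrelated. In an actual deep RL setting the per-state gradients would come from the value network via automatic differentiation rather than the closed form used here.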

artificial intelligence, machine learning, reinforcement learning
