Supplementary Material for Rethinking Value Function Learning for Generalization in Reinforcement Learning

Feb-12-2026, 10:27:47 GMT–Neural Information Processing Systems

Then, we calculate the mean stiffness of the value network across all state pairs and report its average computed over all training epochs. Each agent is trained on 200 training levels for 25M environment steps. The mean and standard deviation are computed over 10 different runs. More specifically, we collect 100 training episodes throughout training and evaluate the value network's prediction for the initial state of each trajectory.
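
The excerpt above does not define stiffness. As a rough illustration only, the sketch below assumes stiffness between two states is the cosine similarity of the value-loss gradients at those states, and it averages that quantity over all state pairs. The linear value function, the squared-error loss, and the names `value_loss_grad`, `cosine_similarity`, and `mean_stiffness` are hypothetical stand-ins, not the authors' implementation.

```python
import math
from itertools import combinations


def value_loss_grad(weights, state, target):
    """Gradient of 0.5 * (w . s - target)^2 w.r.t. w (assumed linear value function)."""
    error = sum(w * x for w, x in zip(weights, state)) - target
    return [error * x for x in state]


def cosine_similarity(g1, g2):
    """Cosine of the angle between two gradient vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


def mean_stiffness(weights, states, targets):
    """Average pairwise gradient cosine similarity over all unordered state pairs."""
    grads = [value_loss_grad(weights, s, t) for s, t in zip(states, targets)]
    pairs = list(combinations(grads, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy data: three 2-D states with made-up value targets.
    w = [1.0, 0.0]
    states = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
    targets = [0.0, 0.0, 1.0]
    print(mean_stiffness(w, states, targets))
```

Under this assumed definition, a value near 1 means an update on one state moves the prediction for the other state in the same direction, and a value near -1 means the updates conflict. Averaging the per-epoch result over training, and then taking the mean and standard deviation over runs, would follow the reporting procedure described above.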

