Appendices to "Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting"
–Neural Information Processing Systems
The hyper-parameters for NPG were manually selected by running an evaluation on the nominal task for each domain (without gravity or body part modifications).
Nov-14-2025, 22:34:41 GMT