Appendices to " Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting "

Neural Information Processing Systems 

The hyper-parameters for NPG were manually selected by running an evaluation on the nominal task for each domain (without gravity or body part modifications).