Appendices to "Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting"