Risk-Aware Transfer in Reinforcement Learning using Successor Features: Supplementary Material

Neural Information Processing Systems 

Pseudocode adapted for the total-reward episodic MDP setting is given as Algorithm 1. Please note