Supplementary A Properties of the InfoGAIL

Neural Information Processing Systems 

$I(x; y; c)$ can be decomposed as

$$
\begin{aligned}
I(x; y; c) &= I(y; x) + I(c; x) - I(y, c; x) \\
&= I(y; x) + I(c; x) - H(y, c) + H(y, c \mid x) \\
&= I(y; c) - I(y; c \mid x).
\end{aligned}
$$

$I(s, a; s, a)$ is finally increased as well.

The main parameters for training Ess-InfoGAIL are listed in Table 4. To minimize computational time, we restrict the update of the latent skill distribution to only the first iteration of policy updates. Our experiments demonstrate that this approach does not result in significant performance degradation.
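As a sanity check on the decomposition of I(x; y; c) given earlier in this section, the following minimal sketch evaluates its three expressions on a random discrete joint distribution and confirms that they agree numerically. It is an illustration under stated assumptions, not part of the Ess-InfoGAIL implementation: the helper names (`entropy`, `marginal`, `random_joint`, `interaction_terms`), the alphabet sizes, and the seed are all hypothetical, and x, y, c are treated as generic finite random variables.

```python
import itertools
import math
import random


def entropy(dist):
    """Shannon entropy (in nats) of a distribution given as {outcome: prob}."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0.0)


def marginal(joint, axes):
    """Marginalize a joint {(x, y, c): prob} onto the given axis indices."""
    out = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[i] for i in axes)
        out[key] = out.get(key, 0.0) + prob
    return out


def random_joint(sizes, seed=0):
    """Random strictly positive joint distribution over a finite grid (assumed setup)."""
    rng = random.Random(seed)
    outcomes = list(itertools.product(*(range(n) for n in sizes)))
    weights = [rng.random() + 1e-3 for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


def interaction_terms(joint):
    """Return the three expressions in the decomposition of I(x; y; c)."""
    X, Y, C = 0, 1, 2

    def H(*axes):
        return entropy(marginal(joint, axes))

    # Mutual informations written in terms of joint entropies.
    I_yx = H(Y) + H(X) - H(X, Y)
    I_cx = H(C) + H(X) - H(X, C)
    I_ycx = H(Y, C) + H(X) - H(X, Y, C)
    I_yc = H(Y) + H(C) - H(Y, C)
    # Conditional quantities: H(y, c | x) and I(y; c | x).
    H_yc_given_x = H(X, Y, C) - H(X)
    I_yc_given_x = H(X, Y) + H(X, C) - H(X, Y, C) - H(X)

    first = I_yx + I_cx - I_ycx                     # I(y;x) + I(c;x) - I(y,c;x)
    second = I_yx + I_cx - H(Y, C) + H_yc_given_x   # ... - H(y,c) + H(y,c|x)
    third = I_yc - I_yc_given_x                     # I(y;c) - I(y;c|x)
    return first, second, third


if __name__ == "__main__":
    terms = interaction_terms(random_joint((3, 4, 2), seed=1))
    print("three forms of I(x; y; c):", terms)
```

The three returned values should coincide up to floating-point error for any joint distribution, since each step of the decomposition is an identity between entropies rather than an approximation.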