A Additional Implementation Details

Warming-up the KL Term: Similar to the previous work, we warm up the KL term at the beginning

Neural Information Processing Systems 

In the second scale, we have 10 groups of 8 × 8 × 20-dimensional variables.
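As a quick check on the size of this scale, the total number of latent variables can be worked out directly. The sketch below assumes each group is a tensor of shape 8 × 8 × 20 (read here as height, width, channels); the helper name `latent_count` is hypothetical and used only for illustration.

```python
from math import prod


def latent_count(num_groups, group_shape):
    """Total number of scalar latent variables across all groups at one scale.

    Assumes every group at the scale has the same shape (an assumption made
    for this illustration).
    """
    return num_groups * prod(group_shape)


# Second scale: 10 groups, each assumed to have shape 8 x 8 x 20.
per_group = latent_count(1, (8, 8, 20))            # 8 * 8 * 20 = 1280
second_scale_total = latent_count(10, (8, 8, 20))  # 10 * 1280 = 12800
print(per_group, second_scale_total)
```

Under this reading, each group contributes 1,280 variables and the second scale holds 12,800 in total.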