A Appendix, we know that Q (| Z

Neural Information Processing Systems 

A.1 Algorithm for Learning Categorical Data The over all training algorithm for learning categorical generative models is similar to the other cases. A.2 Practical Algorithm We give a detailed practical algorithm. N(0,I) is a standard Gaussian noise. A simplified loss Similar to Song and Ermon [41], Ho et al. [18], we use a stochastic version of loss (15), in which we only uniformly sample temporal snapshots to compute the loss. Apply gradient descent to update .