Neural Sequence Model Training via $\alpha$-divergence Minimization
