Reviews: Fast Structured Decoding for Sequence Models

Neural Information Processing Systems 

The reviewers and I find the paper interesting, especially because such a simple approach compares favorably with non-autoregressive and expressive autoregressive models for machine translation. I recommend acceptance as a poster, given that the reviewers raise several concerns about the original manuscript. I ask the authors to change the title as agreed in the rebuttal, using terms such as "low-latency" or "fast". The paper appears to use an approximate partition function for training, which is not explained in detail. The theoretical properties of such an approximation may be interesting to study.