Reviews: Sequence Modeling with Unconstrained Generation Order

Neural Information Processing Systems 

Updated review: The authors have indicated that they will run additional experiments and make the clarifications I requested, so I will raise my score in 7 in agreement with the other reviews leading to an "accept" consensus. However, I do note that in their rebuttal the authors describe Gu et al., Stern et al., and Welleck et al. as "concurrent work". To be totally clear, all three of those papers were posted to arxiv in early February; the NeurIPS deadline was over 3 months later and it is now 6 months after they papers appeared online. I would argue that 3 (or 6) months is long enough to provide a more direct comparison and would not consider this submission "concurrent work". I don't think this warrants rejecting the paper, but I do want to note that I disagree with the authors here and still believe that a more direct comparison is appropriate.