Review for NeurIPS paper: Cascaded Text Generation with Markov Transformers
Weaknesses: While I am advocating for this paper's acceptance, I am curious whether the authors believe this will truly become the dominant approach in this area. I find this approach theoretically more appealing than the Levenshtein Transformer, but I do not think the "global communication" that the paper frames as a drawback of that model is strictly a negative. The more local nature of this model does yield a speedup, but successfully capturing long-range dependencies is one of the things transformer models like GPT-3 appear to be good at. This concern is compounded by the paper evaluating only on machine translation: in MT, the input heavily constrains the shape of the output, so long-range output dependencies may not be as necessary as in other generation tasks.
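To make the trade-off I have in mind concrete, here is a sketch of the contrast as I read it, assuming the paper's m-th-order Markov factorization over the output; the notation below is mine, not the authors'.

```latex
% Full autoregressive factorization: each output token conditions on the entire prefix,
% so long-range output-side dependencies can be captured directly.
p(y \mid x) = \prod_{t=1}^{T} p\bigl(y_t \mid y_{<t},\, x\bigr)

% m-th-order Markov factorization (my reading of this paper's model):
% each output token conditions only on a bounded local window of previous output tokens,
% which enables fast cascaded decoding but drops direct long-range output dependencies.
p(y \mid x) = \prod_{t=1}^{T} p\bigl(y_t \mid y_{t-m:t-1},\, x\bigr)
```

In MT the conditioning on x does much of the work, which is why the local factorization may suffice there; my question is whether it still suffices in more open-ended generation.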