Review for NeurIPS paper: Differentiable Top-k with Optimal Transport

Neural Information Processing Systems 

Additional Feedback: Some comments: l. 117: Entropic OT is surely not more computationally friendly than a top-k operator that simply sorts the vector. The same goes for the beam-search method; the present work reads as a sequence of ad hoc definitions rather than a derivation from a principled objective. In particular, it is important to state the optimization objective clearly to enable future comparisons. Can the authors clearly distinguish their contributions from those of Cuturi et al., 2019? The authors' implementation appears to be potentially faster; if so, this should be highlighted.
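To make the cost comparison at l. 117 concrete, here is a minimal NumPy sketch. It is an illustration under my own assumptions, not the authors' implementation: the function names `hard_top_k` and `sinkhorn_top_k`, the squared-distance cost to the anchors {0, 1}, the min-max normalisation of the scores, and the values of `eps` and `n_iters` are all hypothetical choices. The hard operator needs a single partial sort, whereas the entropic relaxation needs many passes over an n x 2 kernel.

```python
import numpy as np


def hard_top_k(x, k):
    """0/1 indicator of the k largest entries, via one partial sort."""
    idx = np.argpartition(-x, k - 1)[:k]
    mask = np.zeros(x.shape[0])
    mask[idx] = 1.0
    return mask


def sinkhorn_top_k(x, k, eps=0.1, n_iters=200):
    """Soft top-k indicator from entropic OT onto the anchors {0, 1}.

    Assumptions (not taken from the paper under review): scores are
    min-max normalised, the cost is the squared distance to each anchor,
    and a fixed number of Sinkhorn iterations is run.
    """
    n = x.shape[0]
    z = (x - x.min()) / (x.max() - x.min())    # scores rescaled to [0, 1]
    anchors = np.array([0.0, 1.0])
    cost = (z[:, None] - anchors[None, :]) ** 2  # shape (n, 2)
    kernel = np.exp(-cost / eps)                 # Gibbs kernel
    mu = np.full(n, 1.0 / n)                     # uniform mass on the scores
    nu = np.array([(n - k) / n, k / n])          # mass k/n goes to anchor 1
    u = np.ones(n)
    for _ in range(n_iters):                     # each pass costs O(n)
        v = nu / (kernel.T @ u)
        u = mu / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]      # transport plan, shape (n, 2)
    return n * plan[:, 1]                        # soft membership in the top k


scores = np.array([0.1, 2.0, -1.0, 3.5, 0.7])
hard = hard_top_k(scores, k=2)
soft = sinkhorn_top_k(scores, k=2)
print(hard)
print(np.round(soft, 3))
```

In this sketch the soft indicator preserves the ordering of the scores and its entries sum to approximately k once the iterations have converged, but it costs `n_iters` passes over the data where the hard operator costs one partial sort. The benefit of the relaxation is differentiability, not speed, and the paper should say so.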