Review for NeurIPS paper: Rankmax: An Adaptive Projection Alternative to the Softmax Function

Neural Information Processing Systems 

Strengths: * The paper derives a continuous approximation of the k-argmax function as a generic projection of a score vector onto the (n, 1)-simplex, or onto the (n, k)-simplex for predicting the top-k relevant labels, based on a strongly convex function g. The first interesting contribution shows how to obtain such an approximation and derives the general solution of the problem under some assumptions on g (separable and 1-strongly convex). Relevant choices of g include the quadratic function and the negative entropy. Specifically, the Rankmax operator is devised as the Euclidean projection whose Lipschitz constant \alpha is adapted to each training instance. The key element is that \alpha can be computed such that the sample's labels occur in the top-k.
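
To make the role of \alpha concrete, here is a minimal sketch of the Euclidean case only (quadratic g), assuming the (n, 1)-simplex is the probability simplex {p >= 0, sum(p) = 1}. It uses the standard sort-based simplex projection, not the paper's own algorithm; the function name `project_simplex` and the hand-picked values of `alpha` are hypothetical and do not reproduce the adaptive, per-instance choice of \alpha.

```python
import numpy as np


def project_simplex(z, alpha=1.0):
    """Euclidean projection of alpha * z onto {p >= 0, sum(p) = 1}.

    Assumption: this is the quadratic-g case on the (n, 1)-simplex,
    with alpha supplied by hand rather than adapted per instance.
    """
    v = alpha * np.asarray(z, dtype=float)
    u = np.sort(v)[::-1]                  # scores in decreasing order
    css = np.cumsum(u) - 1.0              # cumulative sums minus the simplex mass
    ks = np.arange(1, v.size + 1)
    support = u - css / ks > 0            # entries that remain strictly positive
    rho = ks[support][-1]                 # size of the support
    tau = css[rho - 1] / rho              # threshold subtracted from every score
    return np.maximum(v - tau, 0.0)


scores = [2.0, 1.0, 0.1]
p_sharp = project_simplex(scores, alpha=1.0)   # mass concentrates on the top score
p_mid = project_simplex(scores, alpha=0.5)     # approximately [0.75, 0.25, 0.0]
p_flat = project_simplex(scores, alpha=0.1)    # all three entries are positive
```

In this sketch, shrinking `alpha` enlarges the support of the output, which suggests how a suitably chosen \alpha could guarantee that a given label receives nonzero mass.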