Rankmax: An Adaptive Projection Alternative to the Softmax Function
Neural Information Processing Systems
Many machine learning models involve mapping a score vector to a probability vector. Usually, this is done by projecting the score vector onto a probability simplex, and such projections are often characterized as Lipschitz continuous approximations of the argmax function, whose Lipschitz constant is controlled by a parameter that is similar to a softmax temperature.
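To make the mapping from scores to probabilities concrete, here is a minimal sketch of two familiar choices: softmax with a temperature, and Euclidean projection onto the probability simplex. It is illustrative only and is not the Rankmax method itself. The function names and the `scale` parameter (a hypothetical stand-in for a temperature-like parameter) are assumptions introduced for this example.

```python
import math


def softmax(scores, temperature=1.0):
    """Softmax with temperature; a smaller temperature gives a sharper output."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project_to_simplex(scores, scale=1.0):
    """Euclidean projection of scores / scale onto the probability simplex.

    `scale` is a hypothetical temperature-like parameter: a smaller value
    pushes the output closer to a one-hot argmax vector.
    """
    z = [s / scale for s in scores]
    u = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    # Find the threshold tau such that sum(max(z_i - tau, 0)) == 1.
    for k, u_k in enumerate(u, start=1):
        cumsum += u_k
        t = (cumsum - 1.0) / k
        if u_k - t > 0:
            tau = t  # the k largest entries stay in the support
    return [max(v - tau, 0.0) for v in z]


scores = [2.0, 1.0, 0.1]
dense = softmax(scores, temperature=1.0)         # every entry is strictly positive
sparse = project_to_simplex(scores, scale=1.0)   # mass can collapse onto few entries
smooth = project_to_simplex(scores, scale=4.0)   # larger scale spreads the mass
```

Both mappings return a non-negative vector that sums to one. In this sketch, lowering `temperature` or `scale` moves the output toward the argmax, which is the role the abstract assigns to the temperature-like parameter.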
Oct-1-2025, 23:03:21 GMT