Escaping the Gravitational Pull of Softmax

Jincheng Mei

Neural Information Processing Systems 

The softmax is the standard transformation used in machine learning to map real-valued vectors to categorical distributions.
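
As a minimal sketch of this mapping (not taken from the paper), the softmax of a vector can be computed by exponentiating each entry and normalizing by the sum. The function name `softmax` and the example input below are illustrative assumptions; the max-subtraction step is a common numerical-stability device that does not change the result.

```python
import math


def softmax(logits):
    """Map a list of real numbers to a categorical distribution."""
    # Subtracting the maximum leaves the output unchanged
    # (softmax is shift-invariant) but avoids overflow in exp.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical input vector, chosen only for illustration.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The output entries are strictly positive, sum to one, and preserve the ordering of the inputs, which is what makes the result a valid categorical distribution over the vector's indices.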