Reviews: On Controllable Sparse Alternatives to Softmax
–Neural Information Processing Systems
Note: in view of other related papers pointed out during the discussion process, I have adjusted the rating to reflect concerns over the contribution of this work. This submission presents two new methods, namely sparseflex and sparsehourglass, for mapping input vectors to the probability simplex (vectors in the nonnegative orthant whose entries sum to one). The main motivation is to improve on the popular softmax function by inducing sparse output vectors, as advocated by the sparsemax function. To this end, a general optimization framework (sparsegen) for the design of such probability mapping functions is proposed, which minimizes the mismatch to a transformation of the input, penalized by a negative Euclidean norm. Interestingly, it turns out that sparsegen is equivalent to sparsemax, and it is possible to recover various existing mapping functions by choosing different transformations g(.) and penalization coefficients lambda.
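To make the claimed equivalence concrete, here is a minimal sketch, not the authors' implementation. It assumes the sparsegen objective has the form ||p - g(z)||^2 - lambda * ||p||^2 over the simplex with lambda < 1 (my reading of the description above; the squared norms are an assumption). Under that assumption, completing the square reduces the problem to sparsemax applied to g(z) / (1 - lambda). The function names and signatures are hypothetical, chosen for illustration only.

```python
def sparsemax(z):
    """Euclidean projection of z onto the probability simplex (sort-based)."""
    z = [float(v) for v in z]
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(z_sorted, start=1):
        cumsum += v
        # The support condition holds for k = 1..k_max and fails afterwards,
        # so the last k that passes determines the threshold tau.
        if 1.0 + k * v > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(v - tau, 0.0) for v in z]


def sparsegen(z, g=None, lam=0.0):
    """Hypothetical sparsegen sketch: sparsemax of g(z) / (1 - lam).

    Assumes the objective ||p - g(z)||^2 - lam * ||p||^2 with lam < 1,
    which is an assumption, not a formula quoted from the paper.
    """
    if lam >= 1.0:
        raise ValueError("lam must be strictly less than 1")
    gz = list(z) if g is None else list(g(z))
    return sparsemax([v / (1.0 - lam) for v in gz])


if __name__ == "__main__":
    z = [2.0, 1.0, 0.1]
    print(sparsemax(z))            # identity g, lam = 0: plain sparsemax
    print(sparsegen(z, lam=-1.0))  # negative lam shrinks the scores, so the output is less sparse
```

With the identity transformation and lambda = 0, this sketch coincides with sparsemax. Moving lambda toward 1 inflates the scores and drives the output toward a one-hot vector, while more negative values spread mass over more entries, which is the sense in which lambda controls sparsity here.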
Oct-7-2024, 12:04:35 GMT