Reviews: Sigsoftmax: Reanalysis of the Softmax Bottleneck
–Neural Information Processing Systems
The paper analyzes the ability of the soft-max, when used as the output activation function of a neural network, to approximate a posterior distribution. The problem is translated into a study of the rank of the matrices containing the log-probabilities computed by the analyzed activation layer. It is shown that the soft-max does not increase the rank of the input response matrix (i.e. […]). The authors propose to replace the soft-max with the so-called sigsoftmax (i.e. […]). It is shown that the rank of the sigsoftmax matrix is not less than the rank of the soft-max matrix.
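The rank argument summarized in this review can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the sigsoftmax takes the form exp(z)·sigmoid(z), normalized over the output dimension (the review text itself elides the definition), and it uses hypothetical sizes `n`, `m`, `d` and random matrices `H`, `W` purely for illustration.

```python
import numpy as np

def log_softmax(z):
    # Row-wise log-softmax, shifted by the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def log_sigsoftmax(z):
    # Assumed form: sigsoftmax(z)_i is proportional to exp(z_i) * sigmoid(z_i).
    # log(exp(z) * sigmoid(z)) = z + log(sigmoid(z)) = z - log(1 + exp(-z)).
    g = z - np.logaddexp(0.0, -z)
    g = g - g.max(axis=1, keepdims=True)
    return g - np.log(np.exp(g).sum(axis=1, keepdims=True))

# Hypothetical sizes: n contexts, m output classes, hidden dimension d << m.
rng = np.random.default_rng(0)
n, m, d = 50, 40, 3
H = rng.normal(size=(n, d))   # hidden vectors (illustrative)
W = rng.normal(size=(m, d))   # output embedding (illustrative)
Z = H @ W.T                   # logits, rank at most d

rank_logits = np.linalg.matrix_rank(Z)
rank_softmax = np.linalg.matrix_rank(log_softmax(Z))
rank_sigsoftmax = np.linalg.matrix_rank(log_sigsoftmax(Z))

print("rank of logits:            ", rank_logits)
print("rank of log-softmax:       ", rank_softmax)
print("rank of log-sigsoftmax:    ", rank_sigsoftmax)
```

The log-soft-max matrix equals the logits minus a per-row constant, so its rank can exceed that of the logits by at most one. The element-wise nonlinearity in the assumed sigsoftmax form removes that restriction, which is the effect the review describes.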
Oct-7-2024, 21:31:11 GMT