Reviews: SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neural Networks
–Neural Information Processing Systems
This paper proposes an efficient way to approximate the softmax computation for large-vocabulary applications. The idea is to decompose the output matrix with a singular value decomposition (SVD). A low-rank preview based on the largest singular values is then used to select the most probable words, and the partition function is computed exactly only for this limited set of words, which is expected to contribute most of the sum. For the remaining words, the contribution to the partition function is only approximated.
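The procedure described above can be sketched as follows. This is a minimal numpy illustration, not the authors' implementation; the toy sizes (`V`, `D`) and the `width` and `top_n` settings are hypothetical choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 1000, 64          # vocabulary size, hidden size (toy values)
width, top_n = 16, 50    # preview width and candidate count (hypothetical settings)

W = rng.normal(size=(V, D)) / np.sqrt(D)   # output embedding matrix
h = rng.normal(size=D)                     # hidden state

# Exact softmax for reference
logits = W @ h
exact = np.exp(logits - logits.max())
exact /= exact.sum()

# SVD of the output matrix: W = U @ diag(S) @ Vt; define B = U @ diag(S)
U, S, Vt = np.linalg.svd(W, full_matrices=False)
B = U * S                 # shape (V, D)
z = Vt @ h                # transformed hidden state

# Preview logits using only the first `width` columns (largest singular values)
preview = B[:, :width] @ z[:width]

# Refine only the top-N candidate words with the full dot product
cand = np.argpartition(-preview, top_n)[:top_n]
refined = preview.copy()
refined[cand] = B[cand] @ z   # exact logits for the candidate set

# Partition function mixes exact terms (candidates) and preview terms (rest)
approx = np.exp(refined - refined.max())
approx /= approx.sum()
```

For the candidate words the refined logits are exact (`B @ z` reconstructs `W @ h`), so only the tail of the distribution is approximated by the low-rank preview.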