On Controllable Sparse Alternatives to Softmax
Anirban Laha, Saneem Ahmed Chemmengath, Priyanka Agrawal, Mitesh Khapra, Karthik Sankaranarayanan, Harish G. Ramaswamy
Neural Information Processing Systems
Even though softmax is the most prevalent approach amongst them, it has a shortcoming: its outputs are always non-zero, so it is ill-suited for producing sparse probability distributions as output.
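To make the shortcoming concrete, here is a minimal NumPy sketch comparing softmax with sparsemax, a well-known sparse alternative. Sparsemax is used purely as an illustration and is not claimed to be the formulation proposed in this paper; the function names and the example scores are assumptions chosen for the demonstration.

```python
import numpy as np

def softmax(z):
    """Standard softmax; every output is strictly positive in exact arithmetic."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def sparsemax(z):
    """Euclidean projection of z onto the probability simplex (illustrative)."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]           # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum   # which sorted entries stay non-zero
    k_z = k[support][-1]                  # size of the support
    tau = (cumsum[k_z - 1] - 1) / k_z     # threshold subtracted from all scores
    return np.maximum(z - tau, 0.0)

# Hypothetical scores, chosen only to show the contrast.
scores = np.array([1.5, 1.0, -1.0])
dense = softmax(scores)      # all three entries are non-zero
sparse = sparsemax(scores)   # the lowest-scoring entry is exactly zero
```

Both outputs sum to one, but only the sparsemax output assigns exactly zero probability to the lowest-scoring entry.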
Nov-17-2025, 21:38:19 GMT
- Country:
  - Europe > Portugal
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States
      - California > San Diego County
        - San Diego (0.04)
      - Massachusetts
        - Middlesex County > Cambridge (0.04)
        - Plymouth County > Hanover (0.04)
- Technology: