Sparse and Continuous Attention Mechanisms André F. T. Martins,, Ant os Treviso Vlad Niculae, Pe

Neural Information Processing Systems 

Exponential families are widely used in machine learning; they include many distributions in continuous and discrete domains (e.g., Gaussian, Dirichlet, Poisson, and categorical distributions via the softmax transformation). Distributions in each of these families have fixed support. In contrast, for finite domains, there has been recent work on sparse alternatives to softmax (e.g.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found