Transformer Uncertainty Estimation with Hierarchical Stochastic Attention
