The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction
