The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction

Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin

arXiv.org Machine Learning 

While neural networks excel at predicting single-point estimates of a given target for complex machine learning problems, an open research question is the design of neural generative models that output a predictive distribution, capturing the inherent variability of the observations or the model's level of confidence in its predictions. The main motivation behind uncertainty quantification is the design of AI systems for critical applications that are safe and mitigate risks while automating decision-making. On the one hand, Bayesian statistics offer a mathematically grounded framework for reasoning about uncertainty; however, such models generally incur prohibitive computational costs, which limits their use in practice. On the other hand, frequentist methods and metrics have been developed for confidence estimation in neural networks, in particular in the classification setting [Brosse et al., 2020], [Corbière et al., 2019].
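The contrast between a single-point estimate and a predictive distribution can be made concrete with Monte Carlo sampling: a model with a stochastic component is run forward many times, and the spread of the sampled predictions estimates its uncertainty. The sketch below is a generic illustration under assumed toy values (a linear predictor with Gaussian weight noise), not the paper's stochastic self-attention model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: a linear predictor whose weights are perturbed
# by Gaussian noise on every forward pass, standing in for any stochastic
# layer. This is NOT the Monte Carlo Transformer itself.
w_mean = np.array([0.5, -1.2, 2.0])  # assumed weight means
w_std = 0.1                          # assumed weight noise scale

def stochastic_forward(x):
    # Each call samples fresh weights, so repeated calls yield a
    # distribution over predictions rather than a single point.
    w = w_mean + w_std * rng.standard_normal(w_mean.shape)
    return x @ w

x = np.array([1.0, 2.0, 3.0])
samples = np.array([stochastic_forward(x) for _ in range(1000)])

# Monte Carlo estimates: the predictive mean plays the role of the usual
# point estimate, while the standard deviation quantifies uncertainty.
pred_mean = samples.mean()
pred_std = samples.std()
```

With these toy values the predictive mean concentrates near `0.5*1 - 1.2*2 + 2.0*3 = 4.1`, while `pred_std` reflects how the weight noise propagates through the input.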
