The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction

Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin

arXiv.org Machine Learning 

While neural networks excel at predicting single-point estimates of a given target for complex machine learning problems, an open research question is the design of neural generative models that output a predictive distribution, capturing the inherent variability of the observations or the model's level of confidence in its predictions. The main motivation behind uncertainty quantification is the design of AI systems for critical applications that are safe and mitigate risks while automating decision-making. On the one hand, Bayesian statistics offer a mathematically grounded framework for reasoning about uncertainty; however, such models generally incur prohibitive computational costs, which limits their use in practice. On the other hand, frequentist methods and metrics have been developed for confidence estimation in neural networks, in particular in the classification setting [Brosse et al., 2020], [Corbière et al., 2019].
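The contrast between a single-point estimate and a predictive distribution can be made concrete with Monte Carlo sampling: a model with a stochastic component is run forward many times, and the spread of the sampled predictions estimates its uncertainty. The sketch below is a generic illustration under assumed toy values (a linear predictor with Gaussian weight noise), not the paper's stochastic self-attention model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: a linear predictor whose weights are perturbed
# by Gaussian noise on every forward pass, standing in for any stochastic
# layer. This is NOT the Monte Carlo Transformer itself.
w_mean = np.array([0.5, -1.2, 2.0])  # assumed weight means
w_std = 0.1                          # assumed weight noise scale

def stochastic_forward(x):
    # Each call samples fresh weights, so repeated calls yield a
    # distribution over predictions rather than a single point.
    w = w_mean + w_std * rng.standard_normal(w_mean.shape)
    return x @ w

x = np.array([1.0, 2.0, 3.0])
samples = np.array([stochastic_forward(x) for _ in range(1000)])

# Monte Carlo estimates: the predictive mean plays the role of the usual
# point estimate, while the standard deviation quantifies uncertainty.
pred_mean = samples.mean()
pred_std = samples.std()
```

With these toy values the predictive mean concentrates near `0.5*1 - 1.2*2 + 2.0*3 = 4.1`, while `pred_std` reflects how the weight noise propagates through the input.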
