Probabilistic Transformers

Movellan, Javier R.

arXiv.org Machine Learning 

We show that Transformers are Maximum Posterior Probability estimators for Mixtures of Gaussian Models. This brings a probabilistic point of view to Transformers and suggests extensions to inference-time model adaptation and to other probabilistic cases.
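One ingredient of this connection can be checked numerically. The sketch below is not the paper's derivation; it only illustrates a standard identity under assumptions chosen here for illustration: the keys are treated as the means of a Gaussian mixture with equal priors and a shared isotropic covariance sigma2 * I, and all keys have the same norm. Under those assumptions the posterior probability of each mixture component given a query equals the softmax attention weight on the corresponding key. All names (gmm_responsibilities, attention_weights, sigma2) are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def gmm_responsibilities(q, keys, sigma2):
    # Posterior over mixture components given q, assuming equal priors
    # and a shared isotropic covariance sigma2 * I (assumption).
    sq_dist = np.sum((keys - q) ** 2, axis=1)
    return softmax(-sq_dist / (2.0 * sigma2))

def attention_weights(q, keys, sigma2):
    # Softmax dot-product attention with temperature sigma2.
    return softmax(keys @ q / sigma2)

rng = np.random.default_rng(0)
d, n = 8, 5
keys = rng.normal(size=(n, d))
# Equal-norm keys (assumption): makes the ||k||^2 term a constant
# that cancels inside the softmax.
keys /= np.linalg.norm(keys, axis=1, keepdims=True)
values = rng.normal(size=(n, 3))
q = rng.normal(size=d)
sigma2 = np.sqrt(d)  # matches the usual 1/sqrt(d) attention scaling

w_gmm = gmm_responsibilities(q, keys, sigma2)
w_att = attention_weights(q, keys, sigma2)

# Posterior-weighted average of values vs. attention output.
out_gmm = w_gmm @ values
out_att = w_att @ values
print(np.allclose(w_gmm, w_att), np.allclose(out_gmm, out_att))
```

The two weight vectors agree because -||q - k||^2 / (2 sigma2) differs from (q . k) / sigma2 only by terms that are constant across keys when the keys share a norm. If the keys have different norms, the extra -||k||^2 / (2 sigma2) term no longer cancels and the two computations differ.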
