A mathematical perspective on Transformers
Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, Philippe Rigollet
arXiv.org Artificial Intelligence
Transformers play a central role in the inner workings of large language models. We develop a mathematical framework for analyzing Transformers based on their interpretation as interacting particle systems, which reveals that clusters emerge in the long-time limit. Our study explores the underlying theory and offers new perspectives for mathematicians as well as computer scientists.
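To make the particle-system picture concrete, here is a minimal sketch of how clustering can appear in a toy version of attention dynamics. It rests on assumptions not stated in the abstract: tokens are treated as points on the unit sphere, the query, key, and value matrices are taken to be the identity, and time is discretized with a plain explicit Euler step followed by renormalization. The names `attention_step` and `simulate` and the parameters `beta` and `dt` are hypothetical, chosen only for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def attention_step(points, beta=1.0, dt=0.1):
    """One explicit Euler step of the assumed attention dynamics on the sphere."""
    updated = []
    for x in points:
        # Softmax-style weights from inner products with every particle.
        weights = [math.exp(beta * sum(a * b for a, b in zip(x, y)))
                   for y in points]
        total = sum(weights)
        # Attention-weighted average of all particles.
        mean = [sum(w * y[k] for w, y in zip(weights, points)) / total
                for k in range(len(x))]
        # Project the average onto the tangent space at x,
        # take a small step, and renormalize back onto the sphere.
        radial = sum(a * b for a, b in zip(x, mean))
        tangent = [m - radial * a for m, a in zip(mean, x)]
        updated.append(normalize([a + dt * t for a, t in zip(x, tangent)]))
    return updated


def min_pairwise_inner_product(points):
    """Smallest inner product over all pairs; it is 1 when all points coincide."""
    return min(sum(a * b for a, b in zip(x, y))
               for i, x in enumerate(points) for y in points[i + 1:])


def simulate(n=8, d=3, steps=2000, seed=0):
    """Run the toy dynamics from random initial points on the sphere."""
    rng = random.Random(seed)
    points = [normalize([rng.gauss(0, 1) for _ in range(d)]) for _ in range(n)]
    start = min_pairwise_inner_product(points)
    for _ in range(steps):
        points = attention_step(points)
    return start, min_pairwise_inner_product(points)
```

Calling `simulate()` returns the smallest pairwise inner product before and after the run. A final value close to 1 indicates that the particles have collapsed toward a single cluster under these simplified assumptions.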
Feb-6-2024
- Genre:
- Research Report > New Finding (1.00)