Transformers are Universal Predictors
Basu, Sourya; Choraria, Moulik; Varshney, Lav R.
arXiv.org Artificial Intelligence
We find limits to the Transformer architecture for language modeling and show it has a universal prediction property in an information-theoretic sense. We further analyze performance in non-asymptotic data regimes to understand the role of various components of the Transformer architecture, especially in the context of data-efficient training. We validate our theoretical analysis with experiments on both synthetic and real datasets.

In this sense, the Transformer architecture is said to have a universal computation property (Lu et al., 2021), reminiscent of predictive coding hypotheses of the brain that posit one basic operation in neurobiological information processing (Golkar et al., 2022). The basic predictive workings of Transformers and previous findings of universal approximation and computation properties motivate us to ask whether they also have a universal prediction property.
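To make the information-theoretic notion of universal prediction concrete, here is a minimal sketch (not the paper's construction) of a classical universal predictor: the Laplace add-one estimator on a Bernoulli source. Its cumulative log-loss per symbol approaches the source entropy as the sequence grows, i.e., the per-symbol redundancy vanishes, regardless of the unknown parameter `theta`. All names here (`laplace_prob`, `cumulative_log_loss`, `theta`) are illustrative choices, not identifiers from the paper.

```python
import math
import random

def laplace_prob(ones, total):
    """Laplace (add-one) estimate of P(next bit = 1) after seeing
    `ones` ones among `total` bits."""
    return (ones + 1) / (total + 2)

def cumulative_log_loss(bits):
    """Sequential log-loss (in nats) of the Laplace predictor on a bit sequence."""
    loss, ones = 0.0, 0
    for t, b in enumerate(bits):
        p1 = laplace_prob(ones, t)
        p = p1 if b == 1 else 1.0 - p1
        loss -= math.log(p)
        ones += b
    return loss

# Sample a long sequence from an (unknown to the predictor) Bernoulli source.
random.seed(0)
theta, n = 0.7, 10_000
bits = [1 if random.random() < theta else 0 for _ in range(n)]

# Per-symbol redundancy: average log-loss minus the source entropy (nats).
entropy = -(theta * math.log(theta) + (1 - theta) * math.log(1 - theta))
redundancy = cumulative_log_loss(bits) / n - entropy
```

The redundancy shrinks on the order of (log n)/n, which is the sense in which such a predictor is "universal": it asymptotically matches the best predictor that knows the source. The paper's contribution is to establish an analogous property for Transformer language models and to analyze the non-asymptotic regime.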
Jul-15-2023