Positional Attention: Out-of-Distribution Generalization and Expressivity for Neural Algorithmic Reasoning
Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis
arXiv.org Artificial Intelligence
Transformers [Vaswani et al., 2017] are versatile models used in various applications, including vision [Yuan et al., 2021, Khan et al., 2022, Dehghani et al., 2023] and natural language processing [Wei et al., 2022b, Touvron et al., 2023]. Their effectiveness in complex tasks is particularly notable in Large Language Models (LLMs) [Wang et al., 2018, Hendrycks et al., 2021], where they excel at generating coherent text and understanding context. This strong performance has led to increased interest in understanding the Transformer architecture as a computational model capable of executing instructions and solving algorithmic reasoning problems. In this context, Pérez et al. [2021] and Wei et al. [2022a] show that Transformers are Turing complete, and Giannou et al. [2023], Back De Luca and Fountoulakis [2024], and Yang et al. [2024] demonstrate that Transformers can effectively encode instructions to solve linear algebra and graph problems. Additionally, it has been shown that Transformers can perform reasoning tasks using far fewer layers than the number of reasoning steps [Liu et al., 2023], indicating a connection between Transformers and parallel algorithms. Along these lines, Sanford et al. [2024] further demonstrate that Transformers can simulate the Massively Parallel Computation (MPC) model [Andoni et al., 2018], which is based on the MapReduce framework for large-scale data processing [Dean and Ghemawat, 2008]. Complementing this theoretical framework, empirical studies have demonstrated the capabilities of Transformers, among other models, in executing algorithms [Veličković and Blundell, 2021]. Notable applications include basic arithmetic [Lee et al., 2024], sorting [Tay et al., 2020, Yan et al., 2020], dynamic programming
Oct-2-2024