Softmax Transformers are Turing-Complete
Hongjian Jiang, Michael Hahn, Georg Zetzsche, Anthony Widjaja Lin
Hard-attention Chain-of-Thought (CoT) transformers are known to be Turing-complete. However, whether softmax-attention CoT transformers are Turing-complete has remained an open problem. In this paper, we prove the stronger result that length-generalizable softmax CoT transformers are Turing-complete. More precisely, our Turing-completeness proof goes via the CoT extension of the Counting RASP (C-RASP), which corresponds to softmax CoT transformers that admit length generalization. We prove Turing-completeness for CoT C-RASP with causal masking over a unary alphabet (and, more generally, for letter-bounded languages). While we show that this model is not Turing-complete for arbitrary languages, we prove that its extension with relative positional encoding is Turing-complete for arbitrary languages. We empirically validate our theory by training transformers on languages that require complex (non-linear) arithmetic reasoning.
arXiv.org Artificial Intelligence
Nov-26-2025
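
To make the objects in the abstract concrete, below is a minimal, hypothetical Python sketch (not the paper's construction) of the two ingredients it combines: a C-RASP-style causal prefix-counting primitive, and a chain-of-thought loop that repeatedly appends the model's output back onto its own input, here over a unary alphabet as in the paper's first result. All function names are illustrative assumptions.

```python
def crasp_count(seq, predicate):
    """C-RASP's core operation: for each position i, count how many
    positions j <= i satisfy the predicate (causal/prefix counting)."""
    counts, running = [], 0
    for tok in seq:
        running += 1 if predicate(tok) else 0
        counts.append(running)
    return counts

def cot_run(step, prompt, halt_token="#", max_steps=50):
    """Generic CoT loop: feed the growing sequence back into `step`,
    which returns the next token, until a halt token is emitted."""
    seq = list(prompt)
    for _ in range(max_steps):
        nxt = step(seq)
        seq.append(nxt)
        if nxt == halt_token:
            break
    return seq

# Toy `step` over the unary alphabet {"1"}: emit "1" until the total
# number of 1s doubles the prompt length, then halt. This is only a
# stand-in for the arithmetic a Turing-complete CoT model can express.
def double_length_step(seq):
    ones = crasp_count(seq, lambda t: t == "1")[-1]
    prompt_len = next(i for i, t in enumerate(seq) if t == "|")  # separator
    return "1" if ones < 2 * prompt_len else "#"

print("".join(cot_run(double_length_step, "111|")))  # 111|111#
```

The toy `step` doubles the count of 1s in the prompt before halting; the paper's result is that compositions of such counting operations, iterated through a CoT loop, suffice to simulate arbitrary Turing machines.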