Language Models Encode Numbers Using Digit Representations in Base 10
arXiv.org Artificial Intelligence
Large language models (LLMs) frequently make errors when handling even simple numerical problems, such as comparing two small numbers. A natural hypothesis is that these errors stem from how LLMs represent numbers, and specifically from whether those representations capture numeric value. We approach this question starting from the observation that LLM errors on numerical tasks are often distributed across the digits of the answer rather than normally around its numeric value. Through a series of probing experiments and causal interventions, we show that LLMs internally represent numbers with a separate circular representation for each digit in base 10. This digit-wise representation, as opposed to a value representation, sheds light on the error patterns of models on tasks involving numerical reasoning and could serve as a basis for future studies analyzing numerical mechanisms in LLMs.
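To make the contrast between a digit-wise and a value-based representation concrete, the following toy sketch may help. It is not the paper's probing setup or code: it simply assumes each base-10 digit is stored as a point on a unit circle, and the helper names (`digits_of`, `encode`, `decode`, `rotate`) are hypothetical. Under that assumption, a small disturbance to one digit's point changes a single digit of the decoded number, which can be a large change in numeric value.

```python
import math

BASE = 10  # assumption for illustration: digits are represented in base 10


def digits_of(n, width):
    """Return the base-10 digits of n, most significant first, zero-padded to width."""
    return [(n // BASE ** i) % BASE for i in reversed(range(width))]


def encode(n, width):
    """Map each digit d to the point at angle 2*pi*d/10 on the unit circle."""
    return [(math.cos(2 * math.pi * d / BASE), math.sin(2 * math.pi * d / BASE))
            for d in digits_of(n, width)]


def decode(points):
    """Read each digit back from its angle, then reassemble the integer."""
    n = 0
    for x, y in points:
        angle = math.atan2(y, x) % (2 * math.pi)
        d = round(angle * BASE / (2 * math.pi)) % BASE
        n = n * BASE + d
    return n


def rotate(point, steps):
    """Rotate one digit's point by `steps` digit positions around the circle."""
    x, y = point
    a = 2 * math.pi * steps / BASE
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


code = encode(392, width=3)
assert decode(code) == 392

# Disturb only the hundreds digit by one step: a single digit is wrong,
# yet the decoded value is off by 100.
code[0] = rotate(code[0], 1)
print(decode(code))  # 492

# The circle wraps around: rotating a 9 by one step lands on 0.
wrapped = encode(9, width=1)
wrapped[0] = rotate(wrapped[0], 1)
print(decode(wrapped))  # 0
```

In this toy model, errors show up as wrong digits rather than as values scattered closely around the correct number, which is the kind of error pattern the abstract describes.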
Oct-15-2024