AITopics | benford

Benford's Curse: Tracing Digit Bias to Numerical Hallucination in LLMs

Neural Information Processing Systems | Jun-21-2026, 13:51:30 GMT

Large Language Models (LLMs) exhibit impressive performance on complex reasoning tasks, yet they frequently fail on basic numerical problems, producing incorrect outputs. Inspired by Benford's Law, a statistical pattern in which lower digits occur more frequently as leading digits, we hypothesize that the skewed digit distributions in web-collected corpora may be learned by LLMs during pretraining, leading to biased numerical generation. To investigate the hypothesis, we first examine whether digit frequencies in a pretraining corpus (OLMo2) follow Benford's Law. We then construct an evaluation benchmark in which the ground-truth digits are uniformly distributed within each of the seven numerical reasoning tasks. Our evaluation results demonstrate that leading open-source LLMs show a consistent pattern of digit bias that resembles Benford's Law. Through logit-lens tracing and neuron-level dissection, we identify that this bias arises predominantly from a small subset of highly digit-selective feed-forward network (FFN) neurons in the deeper layers. Finally, we demonstrate that pruning these neurons mitigates imbalanced overgeneration and partially corrects erroneous outputs, providing causal evidence that fine-grained pretraining digit bias can propagate into model behavior. Our findings reveal a fundamental connection between corpus-level statistics and symbolic failure modes in LLMs, offering a new lens for diagnosing and mitigating hallucinations in numerical tasks.
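
As a rough illustration of the first step described in the abstract (checking whether digit frequencies in text follow Benford's Law), the sketch below counts the leading digits of numbers found in a string and prints them next to the Benford probabilities log10(1 + 1/d). It is not the authors' pipeline: the regular expression, the function names `benford_probs` and `leading_digit_freqs`, and the sample sentence are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical, simplified notion of "a number in text": an integer or a decimal.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def benford_probs():
    """Expected leading-digit probabilities under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_freqs(text):
    """Relative frequency of each leading digit 1-9 among numbers found in text."""
    counts = Counter()
    for token in NUMBER_RE.findall(text):
        # The first nonzero digit is the leading significant digit (0.9 -> 9).
        first = next((ch for ch in token if ch in "123456789"), None)
        if first is not None:
            counts[int(first)] += 1
    total = sum(counts.values())
    return {d: (counts[d] / total if total else 0.0) for d in range(1, 10)}


# Made-up sample text, used only to exercise the functions.
sample = "In 2019 the firm sold 1340 units at 12.50 each, up 18% from 0.9 million."
observed = leading_digit_freqs(sample)
expected = benford_probs()
for d in range(1, 10):
    print(d, round(observed[d], 3), round(expected[d], 3))
```

A five-number sample is far too small to say anything about Benford's Law; a real check would run the same counting over a large corpus and compare the two columns with a goodness-of-fit measure.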

artificial intelligence, large language model, natural language, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Benford's Curse: Tracing Digit Bias to Numerical Hallucination in LLMs

Neural Information Processing Systems | Jun-13-2026, 18:56:03 GMT

Large Language Models (LLMs) exhibit impressive performance on complex reasoning tasks, yet they frequently fail on basic numerical problems, producing incorrect outputs. Inspired by Benford's Law, a statistical pattern in which lower digits occur more frequently as leading digits, we hypothesize that the skewed digit distributions in web-collected corpora may be learned by LLMs during pretraining, leading to biased numerical generation. To investigate the hypothesis, we first examine whether digit frequencies in a pretraining corpus (OLMo2) follow Benford's Law. We then construct an evaluation benchmark in which the ground-truth digits are uniformly distributed within each of the seven numerical reasoning tasks. Our evaluation results demonstrate that leading open-source LLMs show a consistent pattern of digit bias that resembles Benford's Law. Through logit-lens tracing and neuron-level dissection, we identify that this bias arises predominantly from a small subset of highly digit-selective feed-forward network (FFN) neurons in the deeper layers. Finally, we demonstrate that pruning these neurons mitigates imbalanced overgeneration and partially corrects erroneous outputs, providing causal evidence that fine-grained pretraining digit bias can propagate into model behavior. Our findings reveal a fundamental connection between corpus-level statistics and symbolic failure modes in LLMs, offering a new lens for diagnosing and mitigating hallucinations in numerical tasks.

artificial intelligence, large language model, natural language, (7 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Benford's Curse: Tracing Digit Bias to Numerical Hallucination in LLMs

Shao, Jiandong, Lu, Yao, Yang, Jianfei

arXiv.org Artificial Intelligence | Dec-1-2025

Large Language Models (LLMs) exhibit impressive performance on complex reasoning tasks, yet they frequently fail on basic numerical problems, producing incorrect outputs. Inspired by Benford's Law, a statistical pattern in which lower digits occur more frequently as leading digits, we hypothesize that the skewed digit distributions in web-collected corpora may be learned by LLMs during pretraining, leading to biased numerical generation. To investigate the hypothesis, we first examine whether digit frequencies in a pretraining corpus (OLMo2) follow Benford's Law. We then construct an evaluation benchmark in which the ground-truth digits are uniformly distributed within each of the seven numerical reasoning tasks. Our evaluation results demonstrate that leading open-source LLMs show a consistent pattern of digit bias that resembles Benford's Law. Through logit-lens tracing and neuron-level dissection, we identify that this bias arises predominantly from a small subset of highly digit-selective feed-forward network (FFN) neurons in the deeper layers. Finally, we demonstrate that pruning these neurons mitigates imbalanced overgeneration and partially corrects erroneous outputs, providing causal evidence that fine-grained pretraining digit bias can propagate into model behavior. Our findings reveal a fundamental connection between corpus-level statistics and symbolic failure modes in LLMs, offering a new lens for diagnosing and mitigating hallucinations in numerical tasks.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.01734

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Structural Foundations for Leading Digit Laws: Beyond Probabilistic Mixtures

Berman, Vladimir

arXiv.org Machine Learning | Aug-20-2025

This article presents a modern deterministic framework for the study of leading significant digit distributions in numerical data. Rather than relying on traditional probabilistic or mixture-based explanations, we demonstrate that the observed frequencies of leading digits are determined by the underlying arithmetic, algorithmic, and structural properties of the data-generating process. Our approach centers on a shift-invariant functional equation, whose general solution is given by explicit affine-plus-periodic formulas. This structural formulation explains the diversity of digit distributions encountered in both empirical and mathematical datasets, including cases with pronounced deviations from logarithmic or scale-invariant profiles. We systematically analyze digit distributions in finite and infinite datasets, address deterministic sequences such as prime numbers and recurrence relations, and highlight the emergence of block-structured and fractal features. The article provides a critical examination of probabilistic models, gives explicit examples and counterexamples, and discusses limitations and open problems for further research. Overall, this work establishes a unified mathematical foundation for digital phenomena and offers a versatile toolset for modeling and analyzing digit patterns in applied and theoretical contexts.

artificial intelligence, benford, log 10, (18 more...)

arXiv.org Machine Learning

2508.13237

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

Analyzing The Language of Visual Tokens

Chan, David M., Corona, Rodolfo, Park, Joonyong, Cho, Cheol Jun, Bai, Yutong, Darrell, Trevor

arXiv.org Artificial Intelligence | Nov-7-2024

With the introduction of transformer-based models for vision and language tasks, such as LLaVA and Chameleon, there has been renewed interest in the discrete tokenized representation of images. These models often treat image patches as discrete tokens, analogous to words in natural language, learning joint alignments between visual and human languages. However, little is known about the statistical behavior of these visual languages - whether they follow similar frequency distributions, grammatical structures, or topologies as natural languages. In this paper, we take a natural-language-centric approach to analyzing discrete visual languages and uncover striking similarities and fundamental differences. We demonstrate that, although visual languages adhere to Zipfian distributions, higher token innovation drives greater entropy and lower compression, with tokens predominantly representing object parts, indicating intermediate granularity. We also show that visual languages lack cohesive grammatical structures, leading to higher perplexity and weaker hierarchical organization compared to natural languages. Finally, we demonstrate that, while vision models align more closely with natural languages than other models, this alignment remains significantly weaker than the cohesion found within natural languages. Through these experiments, we demonstrate how understanding the statistical properties of discrete visual languages can inform the design of more effective computer vision models.

frequency, natural language, vq-vae, (15 more...)

arXiv.org Artificial Intelligence

2411.05001

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(6 more...)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

How Artists Improvise and Provoke Robotics

Benford, Steve, Garrett, Rachael, Schneiders, Eike, Tennent, Paul, Chamberlain, Alan, Avila, Juan, Brundell, Pat, Castle-Green, Simon

arXiv.org Artificial Intelligence | Oct-29-2024

We explore transdisciplinary collaborations between artists and roboticists across a portfolio of artworks. Brendan Walker's Broncomatic was a breath-controlled mechanical rodeo bull ride. Blast Theory's Cat Royale deployed a robot arm to play with a family of three cats for twelve days. Different Bodies is a prototype improvised dance performance in which dancers with disabilities physically manipulate two mirrored robot arms. We reflect on these to explore how artists shape robotics research through the two key strategies of improvisation and provocation. Artists are skilled at improvising extended robot experiences that surface opportunities for technology-focused design, but which also require researchers to improvise their research processes. Artists may provoke audiences into reflecting on the societal implications of robots, but at the same time challenge the established techno-centric concepts, methods and underlying epistemology of robotics research.

artificial intelligence, artist, robot, (17 more...)

arXiv.org Artificial Intelligence

2410.22462

Country:

Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

On the Detection of Anomalous or Out-Of-Distribution Data in Vision Models Using Statistical Techniques

O'Mahony, Laura, O'Sullivan, David JP, Nikolov, Nikola S.

arXiv.org Artificial Intelligence | Mar-21-2024

Out-of-distribution data and anomalous inputs are vulnerabilities of machine learning systems today, often causing systems to make incorrect predictions. The diverse range of data on which these models are used makes detecting atypical inputs a difficult and important task. We assess a tool, Benford's law, as a method used to quantify the difference between real and corrupted inputs. We believe that in many settings, it could function as a filter for anomalous data points and for signalling out-of-distribution data. We hope to open a discussion on these applications and further areas where this technique is underexplored.
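
One way to turn "quantify the difference" into a number is to measure how far a sample's leading-digit histogram sits from the Benford distribution. The sketch below uses total variation distance for that purpose. This is an assumption chosen for illustration, not necessarily the statistic used in the paper, and the helper names `leading_digit` and `benford_distance` as well as the synthetic data are hypothetical.

```python
import math
import random


def leading_digit(x):
    """Leading significant digit (1-9) of a nonzero finite number."""
    # Scientific-notation formatting avoids float pitfalls such as
    # 0.3 / 0.1 evaluating to 2.9999999999999996.
    return int(format(abs(x), ".15e")[0])


def benford_distance(values):
    """Total variation distance between observed leading digits and Benford's law."""
    digits = [leading_digit(v) for v in values if v != 0 and math.isfinite(v)]
    if not digits:
        raise ValueError("need at least one nonzero finite value")
    n = len(digits)
    return 0.5 * sum(
        abs(digits.count(d) / n - math.log10(1 + 1 / d)) for d in range(1, 10)
    )


rng = random.Random(0)
# Synthetic data: values spread evenly over four orders of magnitude,
# which follow Benford's law in expectation.
log_uniform = [10 ** rng.uniform(0, 4) for _ in range(5000)]
# Synthetic data: values uniform on [100, 1000), whose leading digits
# are roughly equally likely.
flat = [rng.uniform(100, 1000) for _ in range(5000)]

print(benford_distance(log_uniform))  # small: close to Benford
print(benford_distance(flat))         # larger: far from Benford
```

A filter built on such a score would still need a threshold calibrated on known-clean data, which this sketch does not attempt.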

artificial intelligence, detection, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2403.15497

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Ireland (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.64)

Industry: Law Enforcement & Public Safety > Fraud (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

Crypto Wash Trading: Direct vs. Indirect Estimation

Falk, Brett Hemenway, Tsoukalas, Gerry, Zhang, Niuniu

arXiv.org Artificial Intelligence | Nov-30-2023

Recent studies using indirect statistical methods estimate that around 70% of traded value on centralized crypto exchanges like Binance can be characterized as wash trading. This paper turns to NFT markets, where transaction transparency, including analysis of roundtrip trades and common wallet activities, allows for more accurate direct estimation methods to be applied. We find that roughly 30% of NFT volume and between 45% and 95% of traded value involve wash trading. More importantly, our approach enables a critical evaluation of common indirect estimation methods used in the literature. We find major differences in their effectiveness, with some failing entirely. Roundedness filters, like those used in Cong et al. (2023), emerge as the most accurate. In fact, the two approaches can be closely aligned via hyper-parameter optimization if direct data is available.

transaction, wash trade, wash trading, (16 more...)

arXiv.org Artificial Intelligence

2311.18717

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Banking & Finance > Trading (1.00)
Information Technology > Services > e-Commerce Services (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > e-Commerce > Financial Technology (0.90)
(2 more...)

Does human speech follow Benford's Law?

Hsu, Leo, Berisha, Visar

arXiv.org Artificial Intelligence | Dec-21-2022

Researchers have observed that the frequencies of leading digits in many man-made and naturally occurring datasets follow a logarithmic curve, with numbers that start with the digit 1 accounting for $\sim 30\%$ of all numbers in the dataset and numbers that start with the digit 9 accounting for $\sim 5\%$ of all numbers in the dataset. This phenomenon, known as Benford's Law, is highly repeatable and appears in lists of numbers from electricity bills, stock prices, tax returns, house prices, death rates, lengths of rivers, and naturally occurring images. In this paper we demonstrate that human speech spectra also follow Benford's Law on average. That is, when averaged over many speakers, the frequencies of leading digits in speech magnitude spectra follow this distribution, although with some variability at the individual sample level. We use this observation to motivate a new set of features that can be efficiently extracted from speech and demonstrate that these features can be used to classify between human speech and synthetic speech.
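
The two percentages quoted in this abstract match the standard base-10 statement of Benford's Law, which the abstract itself does not write out. As a reminder, with $d$ denoting the leading digit:

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d \in \{1, \dots, 9\},
\qquad
P(1) = \log_{10} 2 \approx 0.301, \quad
P(9) = \log_{10}\tfrac{10}{9} \approx 0.046 .
```

The nine probabilities sum to $\log_{10} 10 = 1$, since the product of the terms $(d+1)/d$ for $d = 1, \dots, 9$ telescopes to 10.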

artificial intelligence, benford, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2203.13352

Country:

North America > United States > Arizona > Maricopa County > Tempe (0.04)
Asia > India (0.04)

Genre: Research Report (0.83)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

3 Mathematical Laws Data Scientists Need To Know - KDnuggets

#artificialintelligence | Mar-2-2021, 17:36:21 GMT

While data scientists work with data as their main activity, that does not mean mathematical knowledge is unnecessary. Data scientists need to learn and understand the mathematical theory behind machine learning to solve business problems efficiently. The mathematics behind machine learning is not just random notation thrown here and there; it consists of many theories and ideas. These ideas give rise to many of the mathematical laws that contribute to the machine learning we can use right now. Although you could use the mathematics in any way you want to solve a problem, mathematical laws are not limited to machine learning after all.

digit, mathematical law data scientist, probability, (13 more...)

#artificialintelligence

Technology: