benford
Benford's Curse: Tracing Digit Bias to Numerical Hallucination in LLMs
Shao, Jiandong, Lu, Yao, Yang, Jianfei
Large Language Models (LLMs) exhibit impressive performance on complex reasoning tasks, yet they frequently fail on basic numerical problems, producing incorrect outputs. Inspired by Benford's Law, a statistical pattern in which lower digits occur more frequently as leading digits, we hypothesize that the skewed digit distributions in web-collected corpora may be learned by LLMs during pretraining, leading to biased numerical generation. To investigate the hypothesis, we first examine whether digit frequencies in a pretraining corpus (OLMo2) follow Benford's law. We then construct an evaluation benchmark in which the ground-truth digits are uniformly distributed within each of the seven numerical reasoning tasks. Our evaluation results demonstrate that leading open-source LLMs show a consistent pattern of digit bias that resembles Benford's law. Through logit-lens tracing and neuron-level dissection, we identify that this bias arises predominantly from a small subset of highly digit-selective feed-forward network (FFN) neurons in the deeper layers. Finally, we demonstrate that pruning these neurons mitigates imbalanced overgeneration and partially corrects erroneous outputs, providing causal evidence that fine-grained pretraining digit bias can propagate into model behavior. Our findings reveal a fundamental connection between corpus-level statistics and symbolic failure modes in LLMs, offering a new lens for diagnosing and mitigating hallucinations in numerical tasks.
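The corpus check described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the regular expression, the function names, and the sample sentence are all assumptions made here for illustration. The sketch counts the first non-zero digit of every number found in a text and sets the result beside the Benford probability $\log_{10}(1 + 1/d)$.

```python
import math
import re
from collections import Counter

# Hypothetical pattern: matches integers and decimals such as "42", "0.0031", "1,250".
NUMBER_RE = re.compile(r"\d[\d,]*\.?\d*")


def leading_digit(token):
    """Return the first non-zero digit of a numeric string, or None if there is none."""
    for ch in token:
        if ch in "123456789":
            return int(ch)
    return None


def leading_digit_frequencies(text):
    """Empirical frequency of each leading digit 1-9 among the numbers in `text`."""
    counts = Counter()
    for match in NUMBER_RE.finditer(text):
        d = leading_digit(match.group())
        if d is not None:
            counts[d] += 1
    total = sum(counts.values())
    return {d: (counts[d] / total if total else 0.0) for d in range(1, 10)}


def benford_probability(d):
    """Benford's law: probability that the leading digit equals d (base 10)."""
    return math.log10(1 + 1 / d)


# Toy input; a real check would stream a large corpus instead.
sample = "In 1998 the town had 12,400 residents, 130 shops, 17 schools and 0.25 km of tram line."
freqs = leading_digit_frequencies(sample)

for d in range(1, 10):
    print(f"digit {d}: observed {freqs[d]:.3f}, Benford {benford_probability(d):.3f}")
```

A sentence this short says nothing about Benford's law itself; the comparison only becomes meaningful over a large sample of numbers.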
- Asia > Middle East > Jordan (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- North America > United States (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Structural Foundations for Leading Digit Laws: Beyond Probabilistic Mixtures
This article presents a modern deterministic framework for the study of leading significant digit distributions in numerical data. Rather than relying on traditional probabilistic or mixture-based explanations, we demonstrate that the observed frequencies of leading digits are determined by the underlying arithmetic, algorithmic, and structural properties of the data-generating process. Our approach centers on a shift-invariant functional equation, whose general solution is given by explicit affine-plus-periodic formulas. This structural formulation explains the diversity of digit distributions encountered in both empirical and mathematical datasets, including cases with pronounced deviations from logarithmic or scale-invariant profiles. We systematically analyze digit distributions in finite and infinite datasets, address deterministic sequences such as prime numbers and recurrence relations, and highlight the emergence of block-structured and fractal features. The article provides a critical examination of probabilistic models, gives explicit examples and counterexamples, and discusses limitations and open problems for further research. Overall, this work establishes a unified mathematical foundation for digital phenomena and offers a versatile toolset for modeling and analyzing digit patterns in applied and theoretical contexts.
Analyzing The Language of Visual Tokens
Chan, David M., Corona, Rodolfo, Park, Joonyong, Cho, Cheol Jun, Bai, Yutong, Darrell, Trevor
With the introduction of transformer-based models for vision and language tasks, such as LLaVA and Chameleon, there has been renewed interest in the discrete tokenized representation of images. These models often treat image patches as discrete tokens, analogous to words in natural language, learning joint alignments between visual and human languages. However, little is known about the statistical behavior of these visual languages - whether they follow similar frequency distributions, grammatical structures, or topologies as natural languages. In this paper, we take a natural-language-centric approach to analyzing discrete visual languages and uncover striking similarities and fundamental differences. We demonstrate that, although visual languages adhere to Zipfian distributions, higher token innovation drives greater entropy and lower compression, with tokens predominantly representing object parts, indicating intermediate granularity. We also show that visual languages lack cohesive grammatical structures, leading to higher perplexity and weaker hierarchical organization compared to natural languages. Finally, we demonstrate that, while vision models align more closely with natural languages than other models, this alignment remains significantly weaker than the cohesion found within natural languages. Through these experiments, we demonstrate how understanding the statistical properties of discrete visual languages can inform the design of more effective computer vision models.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
How Artists Improvise and Provoke Robotics
Benford, Steve, Garrett, Rachael, Schneiders, Eike, Tennent, Paul, Chamberlain, Alan, Avila, Juan, Brundell, Pat, Castle-Green, Simon
We explore transdisciplinary collaborations between artists and roboticists across a portfolio of artworks. Brendan Walker's Broncomatic was a breath-controlled mechanical rodeo bull ride. Blast Theory's Cat Royale deployed a robot arm to play with a family of three cats for twelve days. Different Bodies is a prototype improvised dance performance in which dancers with disabilities physically manipulate two mirrored robot arms. We reflect on these to explore how artists shape robotics research through the two key strategies of improvisation and provocation. Artists are skilled at improvising extended robot experiences that surface opportunities for technology-focused design, but which also require researchers to improvise their research processes. Artists may provoke audiences into reflecting on the societal implications of robots, but at the same time challenge the established techno-centric concepts, methods and underlying epistemology of robotics research.
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Sweden > Stockholm > Stockholm (0.04)
On the Detection of Anomalous or Out-Of-Distribution Data in Vision Models Using Statistical Techniques
O'Mahony, Laura, O'Sullivan, David JP, Nikolov, Nikola S.
Out-of-distribution data and anomalous inputs are vulnerabilities of machine learning systems today, often causing systems to make incorrect predictions. The diverse range of data on which these models are used makes detecting atypical inputs a difficult and important task. We assess Benford's law as a tool for quantifying the difference between real and corrupted inputs. We believe that in many settings, it could function as a filter for anomalous data points and for signalling out-of-distribution data. We hope to open a discussion on these applications and further areas where this technique is underexplored.
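One way to picture "quantifying the difference" is a single score for how far a sample's leading digits sit from Benford's law. The sketch below is an assumption-laden illustration, not the method evaluated in the paper: the choice of mean absolute deviation, the function names `first_digit` and `benford_deviation`, and the two toy samples are all introduced here.

```python
import math

# Benford probabilities for leading digits 1..9 (base 10).
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Leading significant digit of a finite, non-zero number; None otherwise."""
    if not math.isfinite(x) or x == 0:
        return None
    # Scientific notation puts the leading significant digit first, e.g. '2.5...e+02'.
    return int(f"{abs(x):.15e}"[0])


def benford_deviation(values):
    """Mean absolute deviation between observed leading-digit frequencies and Benford's law."""
    counts = [0] * 9
    for v in values:
        d = first_digit(v)
        if d is not None:
            counts[d - 1] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no finite, non-zero values to score")
    return sum(abs(c / total - p) for c, p in zip(counts, BENFORD)) / 9


# Toy data: powers of two are known to track Benford's law closely,
# while values squeezed into a narrow range do not.
benford_like = [2 ** k for k in range(1, 200)]
narrow_range = [5.0 + 0.01 * k for k in range(199)]

print("powers of two:", round(benford_deviation(benford_like), 4))
print("narrow range: ", round(benford_deviation(narrow_range), 4))
```

Using such a score as a filter would additionally need a threshold, which would have to be calibrated on data known to be in-distribution.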
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Ireland (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
Crypto Wash Trading: Direct vs. Indirect Estimation
Falk, Brett Hemenway, Tsoukalas, Gerry, Zhang, Niuniu
Recent studies using indirect statistical methods estimate that around 70% of traded value on centralized crypto exchanges like Binance can be characterized as wash trading. This paper turns to NFT markets, where transaction transparency, including analysis of roundtrip trades and common wallet activities, allows for more accurate direct estimation methods to be applied. We find that roughly 30% of NFT volume and between 45% and 95% of traded value involve wash trading. More importantly, our approach enables a critical evaluation of common indirect estimation methods used in the literature. We find major differences in their effectiveness, with some failing entirely. Roundedness filters, like those used in Cong et al. (2023), emerge as the most accurate. In fact, the two approaches can be closely aligned via hyper-parameter optimization if direct data is available.
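To make the idea of direct estimation from roundtrip trades concrete, here is a deliberately simplified sketch. It is not the authors' procedure: the `Trade` record, the rule "a token returns to a wallet that sold it earlier", and the example wallets are hypothetical, and a legitimate buy-back would be flagged by this rule as well.

```python
from collections import defaultdict
from typing import NamedTuple


class Trade(NamedTuple):
    token_id: str   # identifier of the NFT
    seller: str     # selling wallet
    buyer: str      # buying wallet
    price: float    # traded value


def flag_roundtrips(trades):
    """Indices of trades that hand a token back to a wallet that sold it earlier.

    `trades` is assumed to be in chronological order. Only the closing leg
    of each roundtrip is flagged.
    """
    past_sellers = defaultdict(set)  # token_id -> wallets that have already sold it
    flagged = []
    for i, trade in enumerate(trades):
        if trade.buyer in past_sellers[trade.token_id]:
            flagged.append(i)
        past_sellers[trade.token_id].add(trade.seller)
    return flagged


def flagged_value_share(trades):
    """Share of total traded value that sits in flagged trades."""
    total = sum(t.price for t in trades)
    flagged = sum(trades[i].price for i in flag_roundtrips(trades))
    return flagged / total if total else 0.0


trades = [
    Trade("nft1", "A", "B", 10.0),
    Trade("nft1", "B", "A", 12.0),  # token returns to wallet A
    Trade("nft2", "C", "D", 5.0),
    Trade("nft1", "A", "E", 3.0),
]

print(flag_roundtrips(trades))       # indices of flagged trades
print(flagged_value_share(trades))   # flagged value / total value
```

Detecting common wallet activity, which the abstract also mentions, would require funding and ownership links between wallets and is outside this sketch.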
- North America > United States > Pennsylvania (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
- Banking & Finance > Trading (1.00)
- Information Technology > Services > e-Commerce Services (0.34)
Does human speech follow Benford's Law?
Researchers have observed that the frequencies of leading digits in many man-made and naturally occurring datasets follow a logarithmic curve, with numbers that start with the digit 1 accounting for $\sim 30\%$ of all numbers in the dataset and numbers that start with the digit 9 accounting for $\sim 5\%$ of all numbers in the dataset. This phenomenon, known as Benford's Law, is highly repeatable and appears in lists of numbers from electricity bills, stock prices, tax returns, house prices, death rates, lengths of rivers, and naturally occurring images. In this paper we demonstrate that human speech spectra also follow Benford's Law on average. That is, when averaged over many speakers, the frequencies of leading digits in speech magnitude spectra follow this distribution, although with some variability at the individual sample level. We use this observation to motivate a new set of features that can be efficiently extracted from speech and demonstrate that these features can be used to distinguish human speech from synthetic speech.
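For reference, the logarithmic curve behind the $\sim 30\%$ and $\sim 5\%$ figures is the standard base-10 statement of Benford's law. The symbol $P(d)$ is notation introduced here, not taken from the abstract.

```latex
\[
  P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d \in \{1, 2, \dots, 9\}
\]
\[
  P(1) = \log_{10} 2 \approx 0.301, \qquad
  P(9) = \log_{10}\tfrac{10}{9} \approx 0.046
\]
```

The nine probabilities sum to $\log_{10} 10 = 1$, because the product of the terms $(d+1)/d$ for $d = 1, \dots, 9$ telescopes to 10.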
- North America > United States > Arizona > Maricopa County > Tempe (0.04)
- Asia > India (0.04)
3 Mathematical Laws Data Scientists Need To Know - KDnuggets
Although a data scientist works with data as their main activity, that does not mean mathematical knowledge is unnecessary. Data scientists need to learn and understand the mathematical theory behind machine learning to solve business problems efficiently. The mathematics behind machine learning is not just random notation thrown here and there; it consists of many theories and ideas. These ideas have produced many mathematical laws that contribute to the machine learning we use today. You can apply the mathematics however you like to solve a problem; mathematical laws are not limited to machine learning, after all.
Benford's law: what does it say on adversarial images?
Zago, João G., Baldissera, Fabio L., Antonelo, Eric A., Saad, Rodrigo T.
Convolutional neural networks (CNNs) are fragile to small perturbations in the input images. These networks are thus prone to malicious attacks that perturb the inputs to force a misclassification. Such slightly manipulated images aimed at deceiving the classifier are known as adversarial images. In this work, we investigate statistical differences between natural images and adversarial ones. More precisely, we show that, with a suitable image transformation and for a class of adversarial attacks, the distribution of the leading digit of the pixels in adversarial images deviates from Benford's law. The stronger the attack, the more distant the resulting distribution is from Benford's law. Our analysis provides a detailed investigation of this new approach, which can serve as a basis for alternative adversarial example detection methods that need neither to modify the original CNN classifier nor to work on the raw high-dimensional pixels as features to defend against attacks.
- North America > Canada > Ontario > Toronto (0.14)
- South America > Brazil > Santa Catarina > Florianópolis (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
Unethical AI unfairly impacts protected classes - and everybody else as well
There are well-documented examples of AI systems making decisions that affect protected classes, in areas such as housing assistance or unemployment benefits. AI can be used to screen resumes; banks apply AI models to grant individual consumers credit and set interest rates for them. Many small decisions, taken together, can have large effects; for example, AI-driven price discrimination could lead to certain groups in a society consistently paying more. But are there AI applications today that affect everyone, no matter their "class"? As I mentioned earlier, we are shifting our AI Ethics courses to more practical, useful techniques.
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.06)
- North America > United States > New Mexico > Eddy County > Carlsbad (0.06)
- North America > United States > Pennsylvania (0.05)
- Banking & Finance (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Law (0.91)