Learning Pseudorandom Numbers with Transformers: Permuted Congruential Generators, Curricula, and Interpretability
arXiv.org Artificial Intelligence
We study the ability of Transformer models to learn sequences generated by Permuted Congruential Generators (PCGs), a widely used family of pseudo-random number generators (PRNGs). PCGs are substantially harder to learn than linear congruential generators (LCGs) because they apply a series of bit-wise shifts, XORs, rotations, and truncations to the hidden state. We show that Transformers can nevertheless perform in-context prediction on unseen sequences from diverse PCG variants, on tasks beyond published classical attacks. Surprisingly, even when the output is truncated to a single bit, the model can still predict it reliably. When multiple distinct PRNGs are presented together during training, the model learns them jointly, identifying the structure of each permutation. We demonstrate a scaling law with the modulus m: the number of in-context sequence elements required for near-perfect prediction grows as m. Finally, analyzing the embedding layers, we uncover a novel clustering phenomenon: the model spontaneously groups the integer inputs into bitwise rotationally-invariant clusters, revealing how representations can transfer from smaller to larger moduli.

Transformer-based models have achieved remarkable success across language, vision, and algorithmic tasks, demonstrating an ability to capture complex patterns from large-scale data (Vaswani et al., 2023; Dosovitskiy et al., 2021). Beyond supervised training, they can acquire new behaviors directly from examples provided in the input, a phenomenon known as in-context learning (Brown et al., 2020; Olsson et al., 2022). Despite these successes, fundamental questions remain: what kinds of patterns can Transformers reliably learn, how can we train them efficiently, and what mechanisms underlie their ability to generalize? To address these questions, we use pseudo-random number generators (PRNGs) as a controlled benchmark.
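To make the "shifts, XORs, rotations and truncations" concrete, the following is a minimal sketch of one well-known PCG variant, PCG-XSH-RR with 64-bit state and 32-bit output. The multiplier and increment are the standard constants from the public PCG reference implementation; the abstract does not specify which variants or parameters the paper actually trains on, so this is an illustration of the generator family, not the paper's exact setup.

```python
# Standard constants from the PCG reference implementation (pcg32).
MULT = 6364136223846793005
INC = 1442695040888963407
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def pcg32_step(state):
    """One step of PCG-XSH-RR 64/32.

    The hidden state advances as a plain 64-bit LCG; the output is a
    permuted, truncated view of the *old* state: an xorshift folds the
    high bits down, then the top 5 state bits select a random rotation.
    """
    new_state = (state * MULT + INC) & MASK64          # LCG update mod 2^64
    xorshifted = (((state >> 18) ^ state) >> 27) & MASK32  # xorshift-high, truncate to 32 bits
    rot = state >> 59                                   # rotation amount in [0, 31]
    out = ((xorshifted >> rot) | ((xorshifted << (32 - rot)) & MASK32)) & MASK32
    return new_state, out

def pcg32_sequence(seed, n):
    """Emit n 32-bit outputs starting from the given 64-bit seed state."""
    state, outs = seed & MASK64, []
    for _ in range(n):
        state, out = pcg32_step(state)
        outs.append(out)
    return outs
```

Truncating each output to its lowest bit (`out & 1`) yields the single-bit prediction task the abstract describes; the Transformer then sees only one bit per hidden 64-bit state.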
Oct-31-2025