Large Language Models and Emergence: A Complex Systems Perspective
Krakauer, David C., Krakauer, John W., Mitchell, Melanie
arXiv.org Artificial Intelligence
Large Language Models (LLMs) are deep neural networks that, through training on huge amounts of text, learn to accurately predict the next word (or token) in a text. It has surprised many that next-token prediction has led to impressive abilities, such as learning syntax, generating code, writing in any style, and recalling facts. It has been claimed in the LLM literature that, as the number of network parameters and the amount of training data are scaled up, certain capabilities arise suddenly and unexpectedly, a phenomenon these writers term "emergence". For example, Wei et al. [1] write, "we define emergent abilities of large language models as abilities that are not present in smaller-scale models but are present in large-scale models; thus they cannot be predicted by simply extrapolating the performance improvements on smaller-scale models." And in a recent review of emergent abilities in LLMs, Berti et al. [2] survey around 100 papers, the majority of which equate emergence with the discontinuous appearance of abilities as data or model size increases.
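The next-token-prediction objective described above can be illustrated with a minimal sketch. This is not how an LLM is implemented; it is a toy bigram model with hand-set, hypothetical counts, meant only to show what "predict the next token from context" means as a probabilistic task.

```python
# Toy next-token predictor: a bigram "language model" with hand-set counts.
# Real LLMs learn such statistics with deep networks trained on huge corpora;
# this sketch only illustrates the prediction objective itself.
# All tokens and counts here are hypothetical.
bigram_counts = {
    "the": {"cat": 3, "dog": 1},
    "cat": {"sat": 2, "ran": 2},
}

def next_token_probs(context):
    """Return a probability distribution over the next token given the previous one."""
    counts = bigram_counts[context]
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def predict(context):
    """Greedy decoding: pick the most probable next token."""
    probs = next_token_probs(context)
    return max(probs, key=probs.get)

print(next_token_probs("the"))  # {'cat': 0.75, 'dog': 0.25}
print(predict("the"))           # cat
```

An LLM replaces the count table with a neural network conditioned on the entire preceding context, but the training signal is the same: maximize the probability assigned to the actual next token.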
Jun-16-2025