Reply to "Emergent LLM behaviors are observationally equivalent to data leakage"
Ariel Flint Ashery, Luca Maria Aiello, Andrea Baronchelli
arXiv.org Artificial Intelligence
Abstract

A potential concern when simulating populations of large language models (LLMs) is data contamination, i.e. the possibility that training data may shape outcomes in unintended ways. While this concern is important and may hinder certain experiments with multi-agent models, it does not preclude the study of genuinely emergent dynamics in LLM populations. The recent critique by Barrie and Törnberg [1] of the results of Flint Ashery et al. [2] offers an opportunity to clarify that self-organisation and model-dependent emergent dynamics can be studied in LLM populations, highlighting how such dynamics have been empirically observed in the specific case of social conventions.

Barrie and Törnberg [1] question whether the emergence of conventions observed in our recent study of interacting large language models (LLMs) [2] can be attributed to genuine collective dynamics, or whether it instead results from data leakage from the models' training data. In this note, we respond to their main points and argue that the observed dynamics cannot be explained by data contamination alone.
Jun-24-2025