Emergent LLM behaviors are observationally equivalent to data leakage

Barrie, Christopher, Törnberg, Petter

arXiv.org Artificial Intelligence 

Global convergence: Rapid convergence to a single, repeated action (a convention), maximizing joint and individual payoffs. Put simply, while the model does not explicitly identify this as a "naming game" setup, it does understand the basic structure of the scenario, the optimal moves after success, and what global convergence will look like. We conducted this analysis across a range of different LLMs. We then used the OpenAI model gpt-4.1 to annotate three dimensions of each LLM's output: whether it identified the setup as a coordination game; whether it correctly identified the optimal move; and whether it correctly predicted how the scenario would converge globally. We also asked gpt-4.1 to return the text snippet from the given LLM's output that it used to justify each decision.
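The annotation step described above can be sketched as a small schema plus an aggregation routine. This is a minimal illustration under stated assumptions, not the authors' pipeline: the field names, the JSON response format, the prompt wording, and the helper names (`build_annotation_prompt`, `parse_annotation`, `snippet_is_verbatim`, `summarize`) are all hypothetical, and the call to the annotating model is deliberately left out.

```python
import json
from dataclasses import dataclass

# Hypothetical field names for the three annotated dimensions.
DIMENSIONS = (
    "identifies_coordination_game",
    "identifies_optimal_move",
    "predicts_global_convergence",
)


@dataclass
class Annotation:
    identifies_coordination_game: bool
    identifies_optimal_move: bool
    predicts_global_convergence: bool
    evidence_snippet: str  # text the annotator quotes to justify its decision


def build_annotation_prompt(model_output: str) -> str:
    """Assemble an illustrative annotator prompt (wording is an assumption)."""
    keys = ", ".join(DIMENSIONS)
    return (
        "Annotate the following LLM output. Reply with a JSON object with "
        f"boolean keys {keys} and a string key evidence_snippet quoting the "
        "passage that justifies your decision.\n\n"
        f"OUTPUT:\n{model_output}"
    )


def parse_annotation(raw_json: str) -> Annotation:
    """Parse the annotator's JSON reply into an Annotation."""
    data = json.loads(raw_json)
    return Annotation(
        identifies_coordination_game=bool(data["identifies_coordination_game"]),
        identifies_optimal_move=bool(data["identifies_optimal_move"]),
        predicts_global_convergence=bool(data["predicts_global_convergence"]),
        evidence_snippet=str(data["evidence_snippet"]),
    )


def snippet_is_verbatim(annotation: Annotation, model_output: str) -> bool:
    """Check that the quoted evidence actually appears in the annotated output."""
    return annotation.evidence_snippet in model_output


def summarize(annotations: list[Annotation]) -> dict[str, float]:
    """Share of annotated outputs marked True on each dimension."""
    total = len(annotations)
    return {
        dim: sum(getattr(a, dim) for a in annotations) / total
        for dim in DIMENSIONS
    }
```

Checking that the quoted snippet appears verbatim in the annotated output is an added safeguard in this sketch, so that a justification the annotator paraphrased or invented can be flagged for manual review.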
