Not Yet AlphaFold for the Mind: Evaluating Centaur as a Synthetic Participant

Sabrina Namazova, Alessandra Brondetta, Younes Strittmatter, Matthew Nassar, Sebastian Musslick

arXiv.org Artificial Intelligence 

Simulators have revolutionized scientific practice across the natural sciences. By generating data that reliably approximate real-world phenomena, they enable scientists to accelerate hypothesis testing and optimize experimental designs [1, 2]. This is perhaps best illustrated by AlphaFold, a Nobel Prize-winning simulator in chemistry that predicts protein structures from amino acid sequences, enabling rapid prototyping of molecular interactions, drug targets, and protein functions [1]. In the behavioral sciences, a reliable participant simulator--a system capable of producing human-like behavior across cognitive tasks--would represent a similarly transformative advance [3]. Recently, Binz et al. introduced Centaur, a large language model (LLM) fine-tuned on human data from 160 experiments, proposing its use not only as a model of cognition but also as a participant simulator for "in silico prototyping of experimental studies" [4], e.g., to advance automated cognitive science [3, 5]. Although Centaur demonstrates strong predictive accuracy, its generative behavior--a critical criterion for a participant simulator--systematically diverges from human data. This suggests that, while Centaur is a significant step toward predicting human behavior, it does not yet meet the standards of a reliable participant simulator or an accurate model of cognition.

A core criterion for any behavioral simulator is its ability to generate the behavioral patterns observed in experiments.