AutoStan: Autonomous Bayesian Model Improvement via Predictive Feedback

Mar-31-2026 – arXiv.org Machine Learning

We present AutoStan, a framework in which a command-line interface (CLI) coding agent autonomously builds and iteratively improves Bayesian models written in Stan. The agent operates in a loop, writing a Stan model file, executing MCMC sampling, then deciding whether to keep or revert each change based on two complementary feedback signals: the negative log predictive density (NLPD) on held-out data and the sampler's own diagnostics (divergences, R-hat, effective sample size). We evaluate AutoStan on five datasets with diverse modeling structures. On a synthetic regression dataset with outliers, the agent progresses from naive linear regression to a model with Student-t robustness, nonlinear heteroscedastic structure, and an explicit contamination mixture, matching or outperforming TabPFN, a state-of-the-art black-box method, while remaining fully interpretable. Across four additional experiments, the same mechanism discovers hierarchical partial pooling, varying-slope models with correlated random effects, and a Poisson attack/defense model for soccer. No search algorithm, critic module, or domain-specific instructions are needed. This is, to our knowledge, the first demonstration that a CLI coding agent can autonomously write and iteratively improve Stan code for diverse Bayesian modeling problems.
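
As an illustration of the keep-or-revert step described above, the following is a minimal sketch and not the authors' implementation. It assumes the sampler output has already been reduced to a matrix of pointwise log-likelihoods on the held-out data (one row per posterior draw) and to three summary diagnostics. The function names, the `Diagnostics` container, the averaging of NLPD over held-out points, and the thresholds (R-hat below 1.01, effective sample size of at least 400, zero divergences) are all assumptions made for this example; the abstract does not state a fixed decision rule.

```python
import math
from dataclasses import dataclass


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def nlpd(log_lik):
    """Mean negative log predictive density on held-out data.

    log_lik[s][i] is the log-likelihood of held-out point i under
    posterior draw s. The predictive density of each point is estimated
    by averaging the likelihood over draws (assumed convention).
    """
    n_points = len(log_lik[0])
    total = 0.0
    for i in range(n_points):
        total += log_mean_exp([draw[i] for draw in log_lik])
    return -total / n_points


@dataclass
class Diagnostics:
    """Hypothetical summary of sampler diagnostics for one fit."""
    divergences: int
    max_rhat: float
    min_ess: float


def diagnostics_ok(d, rhat_limit=1.01, ess_limit=400.0):
    """Return True if the fit passes the assumed diagnostic thresholds."""
    return d.divergences == 0 and d.max_rhat < rhat_limit and d.min_ess >= ess_limit


def keep_change(new_nlpd, best_nlpd, diag):
    """Keep a model change only if the sampler is healthy and NLPD improves."""
    return diagnostics_ok(diag) and new_nlpd < best_nlpd


# Toy example: two posterior draws, two held-out points.
toy_log_lik = [
    [math.log(0.5), math.log(0.25)],
    [math.log(0.5), math.log(0.75)],
]
toy_nlpd = nlpd(toy_log_lik)
healthy = Diagnostics(divergences=0, max_rhat=1.005, min_ess=900.0)
print(keep_change(toy_nlpd, best_nlpd=0.8, diag=healthy))
```

In this sketch a change that lowers NLPD is still reverted when the sampler reports divergences, which mirrors the idea of treating predictive performance and sampler diagnostics as two complementary signals.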

large language model, machine learning, natural language

Country:
- Europe
  - Switzerland (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Germany > Baden-Württemberg
    - Freiburg (0.04)

Genre:
- Research Report (0.64)

Industry:
- Leisure & Entertainment > Sports > Soccer (0.49)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.70)
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.69)
  - Machine Learning
    - Statistical Learning (0.88)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.55)
