AI Testing Should Account for Sophisticated Strategic Behaviour

Kovarik, Vojtech; Chen, Eric Olav; Petersen, Sami; Ghersengorin, Alexis; Conitzer, Vincent

Aug-22-2025, arXiv.org Artificial Intelligence

This position paper argues for two claims regarding AI testing and evaluation. First, to remain informative about deployment behaviour, evaluations need to account for the possibility that AI systems understand their circumstances and reason strategically. Second, game-theoretic analysis can inform evaluation design by formalising and scrutinising the reasoning in evaluation-based safety cases. Drawing on examples from existing AI systems, a review of relevant research, and formal strategic analysis of a stylised evaluation scenario, we present evidence for these claims and motivate several research directions.

large language model, machine learning, natural language

Country:
- North America > United States (0.28)
- Europe > United Kingdom
  - England > Oxfordshire > Oxford (0.14)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)
- Leisure & Entertainment > Games (0.94)

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence
    - Natural Language > Large Language Model (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.69)