A dataset of questions on decision-theoretic reasoning in Newcomb-like problems
Caspar Oesterheld, Emery Cooper, Miles Kodama, Linh Chi Nguyen, Ethan Perez
arXiv.org Artificial Intelligence
We introduce a dataset of natural-language questions in the decision theory of so-called Newcomb-like problems. Newcomb-like problems include, for instance, decision problems in which an agent interacts with a similar other agent, and thus has to reason about the fact that the other agent will likely reason in similar ways. Evaluating LLM reasoning about Newcomb-like problems is important because interactions between foundation-model-based agents will often be Newcomb-like. Some ways of reasoning about Newcomb-like problems may allow for greater cooperation between models. Our dataset contains both capabilities questions (i.e., questions with a unique, uncontroversially correct answer) and attitude questions (i.e., questions about which decision theorists would disagree). We use our dataset to investigate decision-theoretic capabilities and expressed attitudes, and their interplay, in existing models (different models by OpenAI, Anthropic, Meta, GDM, Reka, etc.), as well as in models under simple prompt-based interventions. We find, among other things, that attitudes vary significantly between existing models; that high capabilities are associated with attitudes more favorable toward so-called evidential decision theory; and that attitudes are consistent across different types of questions.
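To make the decision-theoretic distinction concrete, the classic Newcomb problem can be sketched numerically: evidential decision theory (EDT) conditions expected utility on one's own action, while causal decision theory (CDT) treats the opaque box's content as causally fixed. The predictor accuracy and payoff values below are standard illustrative assumptions, not figures from the paper.

```python
# Illustrative sketch (assumed example values, not from the paper):
# expected utilities in the classic Newcomb problem under EDT vs. CDT.

ACCURACY = 0.99                    # predictor's accuracy (assumption)
MILLION, THOUSAND = 1_000_000, 1_000

# EDT: condition on one's own action, since the action is evidence
# about what the predictor foresaw.
edt_one_box = ACCURACY * MILLION
edt_two_box = (1 - ACCURACY) * MILLION + THOUSAND

# CDT: the opaque box's content is already fixed, so compare outcomes
# for any fixed probability q that it holds the $1M.
def cdt_payoffs(q):
    """Return (one-box, two-box) payoffs given P(box is full) = q."""
    return q * MILLION, q * MILLION + THOUSAND

# EDT favors one-boxing; CDT favors two-boxing for every q
# (two-boxing dominates by exactly $1,000).
assert edt_one_box > edt_two_box
one, two = cdt_payoffs(0.5)
assert two - one == THOUSAND
```

This is the sense in which "attitudes more favorable toward evidential decision theory" matter for cooperation: an EDT-style reasoner facing a similar agent treats its own choice as evidence about the other agent's choice, which can support mutually cooperative behavior.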
Dec-15-2024