Adversarial Counterfactual Environment Model Learning

Apr-30-2026, 01:18:40 GMT–Neural Information Processing Systems

An accurate environment dynamics model is crucial for various downstream tasks in sequential decision-making, such as counterfactual prediction, off-policy evaluation, and offline reinforcement learning. Currently, these models are learned through empirical risk minimization (ERM) by step-wise fitting of historical transition data. This approach was previously believed to be unreliable over long-horizon rollouts because of compounding errors, which can lead to uncontrollable inaccuracies in predictions. In this paper, we find that the challenge extends beyond long-term prediction errors: we reveal that even when planning with one step, learned dynamics models can perform poorly due to the selection bias of behavior policies during data collection.
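
The one-step failure mode described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the linear dynamics s' = s + a, the logging policy a = -0.5 s, and the helper names true_step, behavior_policy, and fit_one_step_model are all hypothetical. Because the logged action is fully determined by the state, step-wise ERM fits the logged transitions almost perfectly yet assigns the action the wrong sign of effect, so a single-step query with an action the policy never took is badly mispredicted.

```python
# Toy illustration only: hypothetical dynamics and policy, not the paper's setup.

def true_step(s, a):
    """Assumed ground-truth dynamics: s' = s + a."""
    return s + a

def behavior_policy(s):
    """Assumed logging policy: the action is a fixed function of the state."""
    return -0.5 * s

def fit_one_step_model(transitions, lr=0.5, steps=200):
    """Step-wise ERM: minimise the mean squared one-step error of the linear
    model s' ~ w_s * s + w_a * a with gradient descent started from zero."""
    w_s, w_a = 0.0, 0.0
    n = len(transitions)
    for _ in range(steps):
        g_s, g_a = 0.0, 0.0
        for s, a, s_next in transitions:
            err = w_s * s + w_a * a - s_next
            g_s += 2.0 * err * s / n
            g_a += 2.0 * err * a / n
        w_s -= lr * g_s
        w_a -= lr * g_a
    return w_s, w_a

# Logged data: states on a grid, actions chosen by the behavior policy.
states = [i / 10.0 for i in range(-10, 11)]
data = [(s, behavior_policy(s), true_step(s, behavior_policy(s))) for s in states]

w_s, w_a = fit_one_step_model(data)

def predict(s, a):
    """One-step prediction of the learned model."""
    return w_s * s + w_a * a

# Error on the logged (in-distribution) transitions.
train_mse = sum((predict(s, a) - s_next) ** 2 for s, a, s_next in data) / len(data)

# One-step counterfactual query: an action the behavior policy never takes at s = 1.
s_query, a_query = 1.0, 1.0
counterfactual_error = abs(predict(s_query, a_query) - true_step(s_query, a_query))

print("learned weights:", w_s, w_a)
print("train MSE:", train_mse)
print("one-step counterfactual error:", counterfactual_error)
```

In this sketch the learned action weight comes out negative even though the assumed true effect of the action is positive, which is the kind of distortion that selection bias in the logged data can introduce without any multi-step rollout being involved.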

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States (1.00)
- Europe (1.00)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Health & Medicine (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.67)
