Concept-driven Off Policy Evaluation

Majumdar, Ritam, Teversham, Jack, Parbhoo, Sonali

Nov-28-2024, arXiv.org Machine Learning

Evaluating off-policy decisions from batch data is challenging because limited sample sizes lead to high-variance estimates. To improve Off-Policy Evaluation (OPE), we must identify and address the sources of this variance. Recent research on Concept Bottleneck Models (CBMs) shows that human-explainable concepts can improve predictions and aid understanding. We propose incorporating concepts into OPE to reduce variance. We introduce a family of concept-based OPE estimators and prove that they remain unbiased and reduce variance when concepts are known and predefined. Since real-world applications often lack predefined concepts, we further develop an end-to-end algorithm that learns interpretable, concise, and diverse parameterized concepts optimized for variance reduction. Experiments on synthetic and real-world datasets show that both known and learned concept-based estimators significantly improve OPE performance. Crucially, we show that, unlike other OPE methods, concept-based estimators are easily interpretable and allow targeted interventions on specific concepts, which further improve the quality of these estimators.
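
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the paper's estimator. It shows ordinary trajectory-wise importance sampling, and how the same computation can be run with policies that condition on a concept of the state rather than the raw state. The names `is_estimate` and `concept_of`, the toy concept map, and both toy policies are hypothetical, chosen purely for illustration.

```python
def is_estimate(trajectories, pi_e, pi_b, key=lambda s: s, gamma=1.0):
    """Trajectory-wise importance sampling estimate of the evaluation policy's value.

    trajectories: list of trajectories, each a list of (state, action, reward)
    pi_e, pi_b:   dicts mapping key(state) -> {action: probability}
    key:          what the policies condition on; the identity gives the usual
                  state-based estimator, a concept map gives a concept-based one
    """
    total = 0.0
    for traj in trajectories:
        weight = 1.0   # product of per-step probability ratios
        ret = 0.0      # discounted return of this trajectory
        for t, (s, a, r) in enumerate(traj):
            k = key(s)
            weight *= pi_e[k][a] / pi_b[k][a]
            ret += (gamma ** t) * r
        total += weight * ret
    return total / len(trajectories)


# Hypothetical concept map: four raw states collapse into two concepts.
def concept_of(state):
    return state // 2


# Hypothetical policies defined over concepts (assumed, not from the paper).
pi_b_c = {0: {"a": 0.5, "b": 0.5}, 1: {"a": 0.5, "b": 0.5}}
pi_e_c = {0: {"a": 0.8, "b": 0.2}, 1: {"a": 0.8, "b": 0.2}}

# Two one-step trajectories logged under the behavior policy.
trajectories = [[(0, "a", 1.0)], [(3, "b", 0.0)]]

estimate = is_estimate(trajectories, pi_e_c, pi_b_c, key=concept_of)
```

Because the ratios are taken over a smaller, coarser conditioning set, a well-chosen concept map can keep the weights from blowing up, which is the intuition behind the variance reduction claimed above. Whether a given concept map preserves unbiasedness depends on conditions the paper states and this sketch does not check.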

artificial intelligence, machine learning, reinforcement learning

Country:
- Asia > Middle East
  - Israel (0.04)
- North America > United States
  - California > San Francisco County
    - San Francisco (0.14)
  - Massachusetts > Suffolk County
    - Boston (0.04)

Genre:
- Research Report > New Finding (0.65)

Industry:
- Health & Medicine > Therapeutic Area
  - Cardiology/Vascular Diseases (1.00)
  - Endocrinology (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Reinforcement Learning (0.67)
    - Statistical Learning (1.00)
  - Representation & Reasoning (1.00)