Model-agnostic and Scalable Counterfactual Explanations via Reinforcement Learning

Samoilescu, Robert-Florian, Van Looveren, Arnaud, Klaise, Janis

Jun-4-2021–arXiv.org Machine Learning

Counterfactual instances are a powerful tool to obtain valuable insights into automated decision processes, describing the necessary minimal changes in the input space to alter the prediction towards a desired target. Most previous approaches require a separate, computationally expensive optimization procedure per instance, making them impractical for both large amounts of data and high-dimensional data. Moreover, these methods are often restricted to certain subclasses of machine learning models (e.g. differentiable or tree-based models). In this work, we propose a deep reinforcement learning approach that transforms the optimization procedure into an end-to-end learnable process, allowing us to generate batches of counterfactual instances in a single forward pass. Our experiments on real-world data show that our method i) is model-agnostic (does not assume differentiability), relying only on feedback from model predictions; ii) allows for generating target-conditional counterfactual instances; iii) allows for flexible feature range constraints for numerical and categorical attributes, including the immutability of protected features (e.g. gender, race); iv) is easily extended to other data modalities such as images.
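As a rough illustration of two ideas in the abstract (feedback that comes only from model predictions, and feature constraints including immutable attributes), here is a minimal sketch. It is not the authors' implementation: the names `apply_constraints`, `reward`, and `toy_predict` are hypothetical, the binary reward is a simplifying assumption, and the proposals are hard-coded here where the paper's method would produce them with a learned generator.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def apply_constraints(
    x: Sequence[float],
    proposal: Sequence[float],
    immutable: Sequence[bool],
    lower: Sequence[float],
    upper: Sequence[float],
) -> Vector:
    """Keep immutable features at their original value; clip the rest to [lower, upper]."""
    constrained = []
    for i, value in enumerate(proposal):
        if immutable[i]:
            constrained.append(x[i])  # e.g. a protected attribute stays unchanged
        else:
            constrained.append(min(max(value, lower[i]), upper[i]))
    return constrained


def reward(predict: Callable[[Sequence[float]], int], x_cf: Sequence[float], target: int) -> float:
    """Assumed binary reward: 1.0 if the black-box prediction equals the target class.

    Only predict(...) is called, so no gradients of the model are needed.
    """
    return 1.0 if predict(x_cf) == target else 0.0


def toy_predict(x: Sequence[float]) -> int:
    """Hypothetical black-box classifier: class 1 when the feature sum exceeds 1."""
    return int(sum(x) > 1.0)


# A small batch: original instances, proposed counterfactuals, and target classes.
originals = [[0.2, 0.3, 1.0], [0.1, 0.1, 0.0]]
proposals = [[0.9, 1.5, 0.0], [0.2, 0.3, 1.0]]  # hard-coded stand-ins for generator output
immutable = [False, False, True]                # the third feature must not change
lower, upper = [0.0] * 3, [1.0] * 3
targets = [1, 1]

counterfactuals = [
    apply_constraints(x, p, immutable, lower, upper)
    for x, p in zip(originals, proposals)
]
rewards = [reward(toy_predict, cf, t) for cf, t in zip(counterfactuals, targets)]
```

In this toy batch the first proposal is clipped into range and reaches the target class, while the second does not once its immutable feature is restored, so a learner trained on this signal would be pushed to change only the mutable features.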

counterfactual, dataset, model-agnostic and scalable counterfactual explanation

Country:
- South America (0.04)
- North America
  - Central America (0.04)
  - United States
    - New York > New York County
      - New York City (0.04)
    - Massachusetts > Plymouth County
      - Hanover (0.04)
    - California > San Francisco County
      - San Francisco (0.14)
  - Puerto Rico > San Juan
    - San Juan (0.04)
- Europe > United Kingdom
  - England > Greater London > London (0.04)

Genre:
- Research Report (0.64)

Industry:
- Health & Medicine > Therapeutic Area (0.31)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Statistical Learning > Regression (0.46)
