Counterfactual Explanations Can Be Manipulated

Oct-9-2024, 09:02:59 GMT–Neural Information Processing Systems

Counterfactual explanations are emerging as an attractive option for providing recourse to individuals adversely impacted by algorithmic decisions. As they are deployed in critical applications (e.g., law enforcement, financial lending), it becomes important to ensure that we clearly understand the vulnerabilities of these methods and find ways to address them. However, there is little understanding of the vulnerabilities and shortcomings of counterfactual explanations. In this work, we introduce the first framework that describes the vulnerabilities of counterfactual explanations and shows how they can be manipulated. More specifically, we show that counterfactual explanations may converge to drastically different counterfactuals under a small perturbation, indicating that they are not robust.
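To make the non-robustness claim concrete, the following is a minimal toy sketch, not the paper's framework or experiments. It assumes a hypothetical one-dimensional model whose logit is x*x - 1 and a simplified gradient-following search that stops as soon as the decision flips (practical counterfactual methods usually also penalize distance from the input). The names `logit`, `logit_grad`, and `find_counterfactual` are illustrative only.

```python
def logit(x: float) -> float:
    """Hypothetical toy model: positive decision iff x*x - 1 >= 0, i.e. |x| >= 1."""
    return x * x - 1.0


def logit_grad(x: float) -> float:
    """Derivative of the toy logit with respect to the input."""
    return 2.0 * x


def find_counterfactual(x: float, lr: float = 0.1, max_steps: int = 1000) -> float:
    """Follow the logit gradient from x until the decision flips to positive.

    Simplification (assumption): no distance penalty, plain gradient ascent.
    """
    cf = x
    for _ in range(max_steps):
        if logit(cf) >= 0.0:
            return cf
        cf += lr * logit_grad(cf)
    raise RuntimeError("no counterfactual found within max_steps")


# Two inputs that differ only by a small perturbation delta.
x = -0.01
delta = 0.02

cf_original = find_counterfactual(x)
cf_perturbed = find_counterfactual(x + delta)

print("input gap:         ", abs(delta))
print("counterfactual gap:", abs(cf_perturbed - cf_original))
```

In this toy setting the two inputs sit on opposite sides of a point where the gradient changes sign, so the search climbs toward opposite decision boundaries and the resulting counterfactuals end up far apart even though the inputs are nearly identical.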

counterfactual explanation, perturbation, recourse, (2 more...)

Industry:
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.61)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)