Review for NeurIPS paper: Generative causal explanations of black-box classifiers

Jan-23-2025, 13:49:17 GMT–Neural Information Processing Systems

This paper presents a generative model to "explain" any given black-box classifier and its training dataset. The explanation takes the form of a hidden factor that, when intervened on, controls the output of the classifier. The factor is discovered by optimizing an objective with two terms: 1) a proposed Information Flow measure that quantifies the causal effect of the hidden factor on the classifier output, and 2) a distribution-similarity term that ensures the discovered hidden factor can reconstruct the feature space. Reviewers found this a borderline paper. After the discussion phase, all reviewers are leaning towards acceptance. As strengths, they pointed out that the paper is very well written and presents a simple yet effective method with extensive ablation experiments.


