eclaire
CGXplain: Rule-Based Deep Neural Network Explanations Using Dual Linear Programs
Hemker, Konstantin, Shams, Zohreh, Jamnik, Mateja
Rule-based surrogate models are an effective and interpretable way to approximate a Deep Neural Network's (DNN) decision boundaries, allowing humans to easily understand deep learning models. Current state-of-the-art decompositional methods, which consider the DNN's latent space to extract more exact rule sets, derive rule sets with high accuracy. However, they a) do not guarantee that the surrogate model has learned from the same variables as the DNN (alignment), b) only allow optimising for a single objective, such as accuracy, which can result in excessively large rule sets (complexity), and c) use decision tree algorithms as intermediate models, which can result in different explanations for the same DNN (stability). To address these limitations, this paper introduces the Column Generation eXplainer (CGX), a decompositional method that uses dual linear programming to extract rules from the hidden representations of the DNN. This approach allows optimising for any number of objectives and empowers users to tweak the explanation model to their needs. We evaluate our results on a wide variety of tasks and show that CGX meets all three criteria: it reproduces the explanation model exactly, which guarantees stability, and it reduces the rule set size by >80% (complexity) at improved accuracy and fidelity across tasks (alignment). In spite of their state-of-the-art performance, the opaqueness and lack of explainability of DNNs have impeded their wide adoption in safety-critical domains such as healthcare or clinical decision-making.
Efficient Decompositional Rule Extraction for Deep Neural Networks
Zarlenga, Mateo Espinosa, Shams, Zohreh, Jamnik, Mateja
In recent years, there has been significant work on increasing both the interpretability and debuggability of a Deep Neural Network (DNN) by extracting a rule-based model that approximates its decision boundary. Nevertheless, current DNN rule extraction methods that consider a DNN's latent space when extracting rules, known as decompositional algorithms, are either restricted to single-layer DNNs or intractable as the size of the DNN or data grows. In this paper, we address these limitations by introducing ECLAIRE, a novel polynomial-time rule extraction algorithm capable of scaling to both large DNN architectures and large training datasets. We evaluate ECLAIRE on a wide variety of tasks, ranging from breast cancer prognosis to particle detection, and show that it consistently extracts more accurate and comprehensible rule sets than the current state-of-the-art methods while using orders of magnitude fewer computational resources.