Dervovic, Danial
Counterfactual Shapley Additive Explanations
Albini, Emanuele, Long, Jason, Dervovic, Danial, Magazzeni, Daniele
Feature attributions are a common paradigm for model explanations due to their simplicity in assigning a single numeric score for each input feature to a model. In the actionable recourse setting, wherein the goal of the explanations is to improve outcomes for model consumers, it is often unclear how feature attributions should be correctly used. With this work, we aim to strengthen and clarify the link between actionable recourse and feature attributions. Concretely, we propose a variant of SHAP, CoSHAP, that uses counterfactual generation techniques to produce a background dataset for use within the marginal (a.k.a. interventional) Shapley value framework. We motivate the need within the actionable recourse setting for careful consideration of background datasets when using Shapley values for feature attributions, alongside the requirement for monotonicity, with numerous synthetic examples. Moreover, we demonstrate the efficacy of CoSHAP by proposing and justifying a quantitative score for feature attributions, counterfactual-ability, showing that as measured by this metric, CoSHAP is superior to existing methods when evaluated on public datasets using monotone tree ensembles.
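For readers unfamiliar with the marginal Shapley value framework mentioned in the abstract above, the following is a minimal sketch of exact marginal (interventional) Shapley values computed against a background dataset. It is not the authors' CoSHAP implementation: the function name `marginal_shapley`, the toy model, and the two background rows are hypothetical, and the background rows merely stand in for points that a counterfactual generation technique would supply.

```python
from itertools import combinations
from math import factorial


def marginal_shapley(f, x, background):
    """Exact marginal Shapley values of f at x with respect to a background dataset.

    Brute force over all feature subsets, so only suitable for a handful of features.
    """
    n = len(x)

    def value(subset):
        # Average model output when features in `subset` are taken from x
        # and all remaining features are taken from each background row.
        total = 0.0
        for b in background:
            z = [x[j] if j in subset else b[j] for j in range(n)]
            total += f(z)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model, used only for illustration.
def model(z):
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
# Hypothetical background rows; in a counterfactual setting these would be
# generated counterfactual points rather than hand-picked values.
background = [[0.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
phi = marginal_shapley(model, x, background)
```

By the efficiency property of Shapley values, the attributions in `phi` sum to the model output at `x` minus the average model output over the background rows, which is why the choice of background dataset directly shapes the attributions.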
Tradeoffs in Streaming Binary Classification under Limited Inspection Resources
Hassanzadeh, Parisa, Dervovic, Danial, Assefa, Samuel, Reddy, Prashant, Veloso, Manuela
Institutions are increasingly relying on machine learning models to identify and alert on abnormal events, such as fraud, cyber attacks and system failures. These alerts often need to be manually investigated by specialists. Given the operational cost of manual inspections, the suspicious events are selected by alerting systems with carefully designed thresholds. In this paper, we consider an imbalanced binary classification problem, where events arrive sequentially and only a limited number of suspicious events can be inspected. We model the event arrivals as a non-homogeneous Poisson process, and compare various suspicious event selection methods including those based on static and adaptive thresholds. For each method, we analytically characterize the tradeoff between the minority-class detection rate and the inspection budget. Given the imbalanced nature of data in this domain, which makes learning classifiers that efficiently discriminate between the minority and majority classes difficult, and the limited resources available for inspecting time-sensitive risky events, we are interested in understanding the relationship between the rate of detection from the minority class (i.e., the fraction of samples from the minority class selected for inspection) and the inspection budget. Specifically, we focus on applications that involve real-time processing and decision-making and where an abnormal event can only be inspected at the time of arrival, and we investigate how different selection policies based on classifier predictions operate in terms of the limited inspection budget rather than the decision threshold.
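The setting described above can be illustrated with a small simulation. The sketch below samples arrivals from a non-homogeneous Poisson process by thinning and applies a static-threshold selection rule under an inspection budget. It is an assumption-laden illustration, not the paper's method or analysis: the function names, the sinusoidal intensity, the 5% minority-class rate, and the Beta-distributed classifier scores are all hypothetical.

```python
import math
import random


def simulate_arrivals(rate, rate_max, horizon, rng):
    """Sample arrival times on [0, horizon] from a non-homogeneous Poisson
    process with intensity rate(t) <= rate_max, using thinning."""
    times, t = [], 0.0
    while True:
        # Candidate arrivals come from a homogeneous process with rate rate_max.
        t += rng.expovariate(rate_max)
        if t > horizon:
            return times
        # Keep each candidate with probability rate(t) / rate_max.
        if rng.random() < rate(t) / rate_max:
            times.append(t)


def static_threshold_policy(scores, labels, threshold, budget):
    """Inspect events in arrival order whose score is at least `threshold`,
    stopping once `budget` inspections are used. Returns the fraction of
    minority-class (label 1) events that were inspected."""
    inspected = detected = 0
    for score, label in zip(scores, labels):
        if inspected >= budget:
            break
        if score >= threshold:
            inspected += 1
            detected += label
    positives = sum(labels)
    return detected / positives if positives else 0.0


rng = random.Random(0)
# Hypothetical intensity: oscillates between 0 and 100 events per unit time.
arrivals = simulate_arrivals(
    rate=lambda t: 50.0 * (1.0 + math.sin(2.0 * math.pi * t)),
    rate_max=100.0,
    horizon=1.0,
    rng=rng,
)
# Hypothetical labels (about 5% minority class) and classifier scores.
labels = [1 if rng.random() < 0.05 else 0 for _ in arrivals]
scores = [rng.betavariate(5, 2) if y else rng.betavariate(2, 5) for y in labels]
detection_rate = static_threshold_policy(scores, labels, threshold=0.6, budget=5)
```

Sweeping `budget` for a fixed `threshold`, or the reverse, gives an empirical view of how the detection rate depends on the inspection budget rather than on the decision threshold alone.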
Counterfactual Explanations for Arbitrary Regression Models
Spooner, Thomas, Dervovic, Danial, Long, Jason, Shepard, Jon, Chen, Jiahao, Magazzeni, Daniele
We present a new method for counterfactual explanations (CFEs) based on Bayesian optimisation that applies to both classification and regression models. Our method is a globally convergent search algorithm with support for arbitrary regression models and constraints like feature sparsity and actionable recourse, and furthermore can answer multiple counterfactual questions in parallel while learning from previous queries. We formulate CFE search for regression models in a rigorous mathematical framework using differentiable potentials, which resolves robustness issues in threshold-based objectives. We prove that in this framework, (a) verifying the existence of counterfactuals is NP-complete; and (b) finding instances using such potentials is CLS-complete. We describe a unified algorithm for CFEs using a specialised acquisition function that composes both expected improvement and an exponential-polynomial (EP) family with desirable properties. Our evaluation on real-world benchmark domains demonstrates high sample-efficiency and precision.
Non-Parametric Stochastic Sequential Assignment With Random Arrival Times
Dervovic, Danial, Hassanzadeh, Parisa, Assefa, Samuel, Reddy, Prashant
We consider a problem wherein jobs arrive at random times and assume random values. Upon each job arrival, the decision-maker must decide immediately whether or not to accept the job and gain the value on offer as a reward, with the constraint that they may accept at most $n$ jobs over some reference time period. The decision-maker only has access to $M$ independent realisations of the job arrival process. We propose an algorithm, Non-Parametric Sequential Allocation (NPSA), for solving this problem. Moreover, we prove that the expected reward returned by the NPSA algorithm converges in probability to optimality as $M$ grows large. We demonstrate the effectiveness of the algorithm empirically on synthetic data and on public fraud-detection datasets, from which the motivation for this work is derived.