Law
Combining Experts’ Causal Judgments
Alrajeh, Dalal (Imperial College London) | Chockler, Hana (King's College London) | Halpern, Joseph Yehuda (Cornell University)
Consider a policymaker who wants to decide which intervention to perform in order to change a currently undesirable situation. The policymaker has at her disposal a team of experts, each with their own understanding of the causal dependencies between different factors contributing to the outcome. The policymaker has varying degrees of confidence in the experts’ opinions. She wants to combine their opinions in order to decide on the most effective intervention. We formally define the notion of an effective intervention, and then consider how experts’ causal judgments can be combined in order to determine the most effective intervention. We define a notion of two causal models being compatible, and show how compatible causal models can be combined. We then use this as the basis for combining experts’ causal judgments. We illustrate our approach on a number of real-life examples.
Deep Learning for Case-Based Reasoning Through Prototypes: A Neural Network That Explains Its Predictions
Li, Oscar (Duke University) | Liu, Hao (Nanjing University) | Chen, Chaofan (Duke University) | Rudin, Cynthia (Duke University)
Deep neural networks are widely used for classification. These deep models often suffer from a lack of interpretability: they are particularly difficult to understand because of their non-linear nature. As a result, neural networks are often treated as "black box" models, and in the past, have been trained purely to optimize the accuracy of predictions. In this work, we create a novel network architecture for deep learning that naturally explains its own reasoning for each prediction. This architecture contains an autoencoder and a special prototype layer, where each unit of that layer stores a weight vector that resembles an encoded training input. The encoder of the autoencoder allows us to do comparisons within the latent space, while the decoder allows us to visualize the learned prototypes. The training objective has four terms: an accuracy term, a term that encourages every prototype to be similar to at least one encoded input, a term that encourages every encoded input to be close to at least one prototype, and a term that encourages faithful reconstruction by the autoencoder. The distances computed in the prototype layer are used as part of the classification process. Since the prototypes are learned during training, the learned network naturally comes with explanations for each prediction, and the explanations are loyal to what the network actually computes.
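The four-term training objective described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the squared Euclidean distance, the averaging, the weights `lam_r1`, `lam_r2`, `lam_ae`, and the function name `prototype_objective` are all hypothetical choices made here for concreteness.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prototype_objective(inputs, reconstructions, encoded, prototypes,
                        class_probs, labels,
                        lam_r1=1.0, lam_r2=1.0, lam_ae=1.0):
    """Combine the four terms named in the abstract into one scalar loss.

    All weights (lam_*) are hypothetical hyperparameters.
    """
    n = len(encoded)
    # 1) Accuracy term: average cross-entropy of the predicted class probabilities.
    accuracy = -sum(math.log(class_probs[i][labels[i]]) for i in range(n)) / n
    # 2) Every prototype should be near at least one encoded input.
    r1 = sum(min(sq_dist(p, z) for z in encoded) for p in prototypes) / len(prototypes)
    # 3) Every encoded input should be near at least one prototype.
    r2 = sum(min(sq_dist(z, p) for p in prototypes) for z in encoded) / n
    # 4) Autoencoder reconstruction error.
    recon = sum(sq_dist(x, r) for x, r in zip(inputs, reconstructions)) / n
    return accuracy + lam_r1 * r1 + lam_r2 * r2 + lam_ae * recon
```

Terms 2 and 3 are deliberately asymmetric: one takes the minimum over encoded inputs for each prototype, the other the minimum over prototypes for each encoded input, which is what ties the prototypes to actual training examples in both directions.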
Predictive Coding Machine for Compressed Sensing and Image Denoising
Li, Jun (Northeastern University) | Liu, Hongfu (Northeastern University) | Fu, Yun (Northeastern University)
Sparse and low-rank coding have received much attention in machine learning, multimedia, and computer vision. Unfortunately, expensive inference restricts the power of coding models in real-world applications, e.g., compressed sensing and image deblurring. To avoid the expensive inference, we propose a predictive coding machine (PCM), which trains a deep neural network (DNN) encoder to approximate the codes. In this way, the codes of a test sample can be quickly approximated by the well-trained DNN. However, the DNN makes PCM a non-convex and non-smooth optimization problem, which is extremely hard to solve. To address this challenge, we extend accelerated proximal gradient to PCM by using it to steer the gradient descent of the DNN. To the best of our knowledge, we are the first to propose a gradient descent algorithm guided by accelerated proximal gradient for solving the PCM problem. In addition, a sufficient condition is provided to ensure convergence to a critical point. Moreover, when the coding models in PCM are convex, a convergence rate of $O(1/(m^2\sqrt{t}))$ holds, where $m$ is the number of accelerated proximal gradient iterations and $t$ is the number of DNN training epochs. Numerical results verify the promising advantages of PCM in terms of effectiveness, efficiency, and robustness.
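To make the "expensive inference" concrete, the sketch below runs plain (unaccelerated) proximal gradient, often called ISTA, on an l1-regularized sparse coding problem. It is not the PCM algorithm from the abstract; the objective 0.5*||x - Dz||^2 + lam*||z||_1, the names `soft_threshold` and `ista`, and the fixed step size are assumptions chosen for illustration. Each test sample needs its own iterative loop like this one, which is the cost a trained encoder would be meant to avoid.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def ista(D, x, lam, step, iters):
    """Proximal gradient for min_z 0.5*||x - D z||^2 + lam*||z||_1.

    D is a list of rows (m x k), x has length m. The step size is assumed
    to be at most 1 / (largest eigenvalue of D^T D) for convergence.
    """
    m, k = len(D), len(D[0])
    z = [0.0] * k
    for _ in range(iters):
        # Residual D z - x.
        residual = [sum(D[i][j] * z[j] for j in range(k)) - x[i] for i in range(m)]
        # Gradient of the smooth part: D^T (D z - x).
        grad = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(k)]
        # Gradient step followed by the l1 proximal step.
        z = soft_threshold([z[j] - step * grad[j] for j in range(k)], step * lam)
    return z
```

With an identity dictionary, the solution is simply the soft-thresholded input, which gives a quick sanity check of the loop.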
Non-Discriminatory Machine Learning Through Convex Fairness Criteria
Goel, Naman (EPFL, Lausanne) | Yaghini, Mohammad (EPFL, Lausanne) | Faltings, Boi (EPFL, Lausanne)
Biased decision making by machine learning systems is increasingly recognized as an important issue. Recently, techniques have been proposed to learn non-discriminatory classifiers by enforcing constraints in the training phase. Such constraints are either non-convex in nature (posing computational difficulties) or don’t have a clear probabilistic interpretation. Moreover, the techniques offer little understanding of the more subjective notion of fairness. In this paper, we introduce a novel technique to achieve non-discrimination without sacrificing convexity and probabilistic interpretation. Our experimental analysis demonstrates the success of the method on popular real datasets including ProPublica’s COMPAS dataset. We also propose a new notion of fairness for machine learning and show that our technique satisfies this subjective fairness criterion.
Neural Ideal Point Estimation Network
Song, Kyungwoo (Korea Advanced Institute of Science and Technology) | Lee, Wonsung (Korea Advanced Institute of Science and Technology) | Moon, Il-Chul (Korea Advanced Institute of Science and Technology)
Understanding politics is challenging because politics is influenced by everything. Even if we limit ourselves to the political context of legislative processes, we need a better understanding of latent factors, such as legislators, bills, their ideal points, and their relations. From the modeling perspective, this is difficult 1) because these observations lie in a high-dimensional space, which requires learning low-dimensional representations, and 2) because these observations require complex probabilistic modeling with latent variables to reflect the causalities. This paper presents a new model, NIPEN, that reflects and helps explain this political setting, including the legislative factors mentioned above. We propose two versions of NIPEN: one is a hybrid of deep learning and a probabilistic graphical model, and the other is a neural tensor model. Our results indicate that NIPEN successfully learns the manifold of the legislative bills' text and uses the learned low-dimensional latent variables to improve the prediction of legislators' votes. Additionally, by virtue of being a domain-rich probabilistic model, NIPEN reveals the hidden strength of the legislators' trust network and their various characteristics in casting votes.
Fairness in Decision-Making — The Causal Explanation Formula
Zhang, Junzhe (Purdue University) | Bareinboim, Elias (Purdue University)
AI plays an increasingly prominent role in society since decisions that were once made by humans are now delegated to automated systems. These systems are currently in charge of deciding bank loans, criminals' incarceration, and the hiring of new employees, and it's not difficult to envision that they will in the future underpin most of the decisions in society. Despite the high complexity entailed by this task, there is still not much understanding of basic properties of such systems. For instance, we currently cannot detect (nor explain or correct) whether an AI system can be deemed fair (i.e., is abiding by the decision constraints agreed on by society) or whether it is reinforcing biases and perpetuating a preceding prejudicial practice. Issues of discrimination have been discussed extensively in political and legal circles, but there is still little understanding of the formal conditions that a system must meet to be deemed fair. In this paper, we use the language of structural causality (Pearl, 2000) to fill this gap. We start by introducing three new fine-grained measures of transmission of change from stimulus to effect, which we call counterfactual direct (Ctf-DE), indirect (Ctf-IE), and spurious (Ctf-SE) effects. We then derive what we call the causal explanation formula, which allows the AI designer to quantitatively evaluate fairness and explain the total observed disparity of decisions through different discriminatory mechanisms. We apply these measures to various discrimination analysis tasks and run extensive simulations, including detection, evaluation, and optimization of decision-making under fairness constraints. We conclude by studying the trade-off between different types of fairness criteria (outcome and procedural), and provide a quantitative approach to policy implementation and the design of fair AI systems.
Fair Inference on Outcomes
Nabi, Razieh (Johns Hopkins University) | Shpitser, Ilya (Johns Hopkins University)
In this paper, we consider the problem of fair statistical inference involving outcome variables. Examples include classification and regression problems, and estimating treatment effects in randomized trials or observational data. The issue of fairness arises in such problems where some covariates or treatments are "sensitive," in the sense of having the potential to create discrimination. We argue that the presence of discrimination can be formalized in a sensible way as the presence of an effect of a sensitive covariate on the outcome along certain causal pathways, a view which generalizes (Pearl 2009). A fair outcome model can then be learned by solving a constrained optimization problem. We discuss a number of complications that arise in classical statistical inference due to this view and provide workarounds based on recent work in causal and semi-parametric inference.
Weighted Abstract Dialectical Frameworks
Brewka, Gerhard (Leipzig University) | Strass, Hannes (Leipzig University) | Wallner, Johannes P. (TU Wien) | Woltran, Stefan (TU Wien)
Abstract Dialectical Frameworks (ADFs) generalize Dung's argumentation frameworks, allowing various relationships among arguments to be expressed in a systematic way. We further generalize ADFs so as to accommodate arbitrary acceptance degrees for the arguments. This makes ADFs applicable in domains where both the initial status of arguments and their relationships are only insufficiently specified by Boolean functions. We define all standard ADF semantics for the weighted case, including grounded, preferred and stable semantics. We illustrate our approach using acceptance degrees from the unit interval and show how other valuation structures can be integrated. In each case it is sufficient to specify how the generalized acceptance conditions are represented by formulas, and to specify the information ordering underlying the characteristic ADF operator. We also present complexity results for problems related to weighted ADFs.
Weighted Voting Via No-Regret Learning
Haghtalab, Nika (Carnegie Mellon University) | Noothigattu, Ritesh (Carnegie Mellon University) | Procaccia, Ariel D. (Carnegie Mellon University)
Voting systems typically treat all voters equally. We argue that perhaps they should not: Voters who have supported good choices in the past should be given higher weight than voters who have supported bad ones. To develop a formal framework for desirable weighting schemes, we draw on no-regret learning. Specifically, given a voting rule, we wish to design a weighting scheme such that applying the voting rule, with voters weighted by the scheme, leads to choices that are almost as good as those endorsed by the best voter in hindsight. We derive possibility and impossibility results for the existence of such weighting schemes, depending on whether the voting rule and the weighting scheme are deterministic or randomized, as well as on the social choice axioms satisfied by the voting rule.
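One way to picture the idea of reweighting voters by past performance is the standard multiplicative-weights (Hedge) update from the no-regret learning literature, paired with weighted plurality. The sketch below is an assumption-laden illustration, not the weighting schemes analyzed in the paper: the per-voter losses in [0, 1], the learning rate `eta`, and the function names are hypothetical.

```python
import math

def weighted_plurality(votes, weights):
    """Return the alternative with the largest total voter weight.

    votes[i] is the alternative supported by voter i. Ties are broken in
    favor of the alternative that sorts first, to keep the rule deterministic.
    """
    totals = {}
    for alternative, w in zip(votes, weights):
        totals[alternative] = totals.get(alternative, 0.0) + w
    return max(sorted(totals), key=lambda a: totals[a])

def update_weights(weights, losses, eta=0.5):
    """Multiplicative-weights update: shrink each voter's weight by
    exp(-eta * loss), then renormalize so the weights sum to one."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [w / total for w in scaled]
```

For example, starting from equal weights with three voters, a single voter who keeps supporting good choices (loss 0) while the other two incur loss 1 each round will eventually outweigh both of them combined, so the weighted outcome follows that voter.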
Beyond Distributive Fairness in Algorithmic Decision Making: Feature Selection for Procedurally Fair Learning
Grgić-Hlača, Nina (Max Planck Institute for Software Systems (MPI-SWS)) | Zafar, Muhammad Bilal (Max Planck Institute for Software Systems (MPI-SWS)) | Gummadi, Krishna P. (Max Planck Institute for Software Systems (MPI-SWS)) | Weller, Adrian (University of Cambridge)
With the widespread use of machine learning methods in numerous domains involving humans, several studies have raised questions about the potential for unfairness towards certain individuals or groups. A number of recent works have proposed methods to measure and eliminate unfairness from machine learning models. However, most of this work has focused on only one dimension of fair decision making: distributive fairness, i.e., the fairness of the decision outcomes. In this work, we leverage the rich literature on organizational justice and focus on another dimension of fair decision making: procedural fairness, i.e., the fairness of the decision-making process. We propose measures for procedural fairness that consider the input features used in the decision process, and evaluate the moral judgments of humans regarding the use of these features. We operationalize these measures on two real-world datasets using human surveys on the Amazon Mechanical Turk (AMT) platform, demonstrating that our measures capture important properties of procedurally fair decision making. We provide fast submodular mechanisms to optimize the tradeoff between procedural fairness and prediction accuracy. On our datasets, we observe empirically that procedural fairness may be achieved with little cost to outcome fairness, but that some loss of accuracy is unavoidable.