Performance Analysis
Operationalizing Individual Fairness with Pairwise Fair Representations
Lahoti, Preethi, Gummadi, Krishna P., Weikum, Gerhard
We revisit the notion of individual fairness proposed by Dwork et al. A central challenge in operationalizing their approach is the difficulty in eliciting a human specification of a similarity metric. In this paper, we propose an operationalization of individual fairness that does not rely on a human specification of a distance metric. Instead, we propose novel approaches to elicit and leverage side-information on equally deserving individuals to counter subordination between social groups. We model this knowledge as a fairness graph, and learn a unified Pairwise Fair Representation (PFR) of the data that captures both data-driven similarity between individuals and the pairwise side-information in the fairness graph. We elicit fairness judgments from a variety of sources, including human judgments for two real-world datasets on recidivism prediction (COMPAS) and violent neighborhood prediction (Crime & Communities). Our experiments show that the PFR model for operationalizing individual fairness is practically viable.
Generalizing from a few environments in safety-critical reinforcement learning
Kenton, Zachary, Filos, Angelos, Evans, Owain, Gal, Yarin
Before deploying autonomous agents in the real world, we need to be confident they will perform safely in novel situations. Ideally, we would expose agents to a very wide range of situations during training, allowing them to learn about every possible danger, but this is often impractical. This paper investigates safety and generalization from a limited number of training environments in deep reinforcement learning (RL). We find RL algorithms can fail dangerously on unseen test environments even when performing perfectly on training environments. Firstly, in a gridworld setting, we show that catastrophes can be significantly reduced with simple modifications, including ensemble model averaging and the use of a blocking classifier. In the more challenging CoinRun environment we find similar methods do not significantly reduce catastrophes. However, we do find that the uncertainty information from the ensemble is useful for predicting whether a catastrophe will occur within a few steps and hence whether human intervention should be requested.
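The idea of using ensemble uncertainty to decide when to ask for human help can be illustrated with a minimal sketch. It assumes each ensemble member outputs an estimated probability of catastrophe for the current state; the function names and both thresholds are hypothetical and are not taken from the paper, whose actual predictor is not described in the abstract.

```python
from statistics import mean, pstdev


def ensemble_summary(predictions):
    """Return (mean, population std) of per-member risk estimates."""
    return mean(predictions), pstdev(predictions)


def should_request_intervention(predictions,
                                risk_threshold=0.5,
                                disagreement_threshold=0.2):
    """Flag a state when the ensemble is, on average, pessimistic
    or when its members disagree strongly with each other.

    Both thresholds are illustrative assumptions.
    """
    avg_risk, disagreement = ensemble_summary(predictions)
    return avg_risk >= risk_threshold or disagreement >= disagreement_threshold


if __name__ == "__main__":
    # Members agree the state is safe.
    print(should_request_intervention([0.1, 0.1, 0.1]))
    # Members agree the state is risky.
    print(should_request_intervention([0.9, 0.8, 0.95]))
    # Low average risk, but one member strongly disagrees.
    print(should_request_intervention([0.0, 0.0, 0.9]))
```

Treating disagreement as a separate trigger means a single dissenting member can prompt a request for help even when the average risk looks low.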
Pathologist-Level Grading of Prostate Biopsies with Artificial Intelligence
Ström, Peter, Kartasalo, Kimmo, Olsson, Henrik, Solorzano, Leslie, Delahunt, Brett, Berney, Daniel M., Bostwick, David G., Evans, Andrew J., Grignon, David J., Humphrey, Peter A., Iczkowski, Kenneth A., Kench, James G., Kristiansen, Glen, van der Kwast, Theodorus H., Leite, Katia R. M., McKenney, Jesse K., Oxley, Jon, Pan, Chin-Chen, Samaratunga, Hemamali, Srigley, John R., Takahashi, Hiroyuki, Tsuzuki, Toyonori, Varma, Murali, Zhou, Ming, Lindberg, Johan, Bergström, Cecilia, Ruusuvuori, Pekka, Wählby, Carolina, Grönberg, Henrik, Rantalainen, Mattias, Egevad, Lars, Eklund, Martin
Background: An increasing volume of prostate biopsies and a world-wide shortage of uro-pathologists put a strain on pathology departments. Additionally, the high intra- and inter-observer variability in grading can result in over- and undertreatment of prostate cancer. Artificial intelligence (AI) methods may alleviate these problems by assisting pathologists to reduce workload and harmonize grading. Methods: We digitized 6,682 needle biopsies from 976 participants in the population-based STHLM3 diagnostic study to train deep neural networks for assessing prostate biopsies. The networks were evaluated by predicting the presence, extent, and Gleason grade of malignant tissue for an independent test set comprising 1,631 biopsies from 245 men. We additionally evaluated grading performance on 87 biopsies individually graded by 23 experienced urological pathologists from the International Society of Urological Pathology. We assessed discriminatory performance by receiver operating characteristics (ROC) and tumor extent predictions by correlating predicted millimeter cancer length against measurements by the reporting pathologist. We quantified the concordance between grades assigned by the AI and the expert urological pathologists using Cohen's kappa. Results: The performance of the AI to detect and grade cancer in prostate needle biopsy samples was comparable to that of international experts in prostate pathology. The AI achieved an area under the ROC curve of 0.997 for distinguishing between benign and malignant biopsy cores, and 0.999 for distinguishing between men with or without prostate cancer. The correlation between millimeter cancer predicted by the AI and assigned by the reporting pathologist was 0.96. For assigning Gleason grades, the AI achieved an average pairwise kappa of 0.62. This was within the range of the corresponding values for the expert pathologists (0.60 to 0.73).
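The "average pairwise kappa" reported above can be made concrete with a small sketch. It computes unweighted Cohen's kappa between two raters and then averages one rater's kappa against each of the others. The abstract does not say whether a weighted variant was used, so the unweighted form here is an assumption, and the function names and example grades are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(target, others):
    """Average kappa between one rater and each of the other raters."""
    return sum(cohens_kappa(target, other) for other in others) / len(others)


if __name__ == "__main__":
    # Made-up grade groups for five biopsies, for illustration only.
    ai = [1, 2, 3, 4, 5]
    pathologists = [[1, 2, 3, 4, 4], [1, 2, 3, 4, 5]]
    print(mean_pairwise_kappa(ai, pathologists))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement when grade distributions are skewed.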
Knowledge Graph Embedding for Ecotoxicological Effect Prediction
Myklebust, Erik B., Jimenez-Ruiz, Ernesto, Chen, Jiaoyan, Wolf, Raoul, Tollefsen, Knut Erik
Exploring the effects a chemical compound has on a species takes a considerable experimental effort. Appropriate methods for estimating and suggesting new effects can dramatically reduce the work needed to be done by a laboratory. In this paper we explore the suitability of using a knowledge graph embedding approach for ecotoxicological effect prediction. A knowledge graph has been constructed from publicly available data sets, including a species taxonomy and chemical classification and similarity. The publicly available effect data is integrated into the knowledge graph using ontology alignment techniques. Our experimental results show that the knowledge graph based approach improves on the selected baselines.
The Cost of a Reductions Approach to Private Fair Optimization
We examine a reductions approach to fair optimization and learning where a black-box optimizer is used to learn a fair model for classification or regression [Alabi et al., 2018, Agarwal et al., 2018] and explore the creation of such fair models that adhere to data privacy guarantees (specifically differential privacy). For this approach, we consider two suites of use cases: the first is for optimizing convex performance measures of the confusion matrix (such as $G$-mean and $H$-mean); the second is for satisfying statistical definitions of algorithmic fairness (such as equalized odds, demographic parity, and the Gini index of inequality). The reductions approach to fair optimization can be abstracted as the constrained group-objective optimization problem where we aim to optimize an objective that is a function of losses of individual groups, subject to some constraints. We present two differentially private algorithms: an $(\epsilon, 0)$ exponential sampling algorithm and an $(\epsilon, \delta)$ algorithm that uses a linear optimizer to incrementally move toward the best decision. We analyze the privacy and utility guarantees of these empirical risk minimization algorithms. Compared to a previous method for ensuring differential privacy subject to a relaxed form of the equalized odds fairness constraint, the $(\epsilon, \delta)$ differentially private algorithm we present provides asymptotically better sample complexity guarantees. The technique of using an approximate linear optimizer oracle to achieve privacy might be applicable to other problems not considered in this paper. Finally, we show an algorithm-agnostic lower bound on the accuracy of any solution to the problem of $(\epsilon, 0)$ or $(\epsilon, \delta)$ private constrained group-objective optimization.
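For intuition about $(\epsilon, 0)$ exponential sampling, the sketch below shows the standard exponential mechanism from the differential privacy literature, which picks a candidate with probability proportional to $\exp(\epsilon u / (2\Delta))$ for utility $u$ and sensitivity $\Delta$. This is the textbook mechanism, not necessarily the algorithm analyzed in the paper; the candidate set, utilities, and function names are hypothetical.

```python
import math
import random


def exponential_mechanism_probs(utilities, epsilon, sensitivity):
    """Selection probabilities proportional to exp(eps * u / (2 * sensitivity))."""
    scale = epsilon / (2.0 * sensitivity)
    # Subtracting the maximum utility avoids overflow and leaves the
    # normalized probabilities unchanged.
    top = max(utilities)
    weights = [math.exp(scale * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def sample_candidate(candidates, utilities, epsilon, sensitivity, rng=None):
    """Draw one candidate according to the exponential mechanism."""
    rng = rng or random.Random()
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical models scored by negative empirical loss.
    models = ["model_a", "model_b", "model_c"]
    scores = [-0.30, -0.10, -0.25]
    print(exponential_mechanism_probs(scores, epsilon=1.0, sensitivity=0.01))
    print(sample_candidate(models, scores, epsilon=1.0, sensitivity=0.01,
                           rng=random.Random(0)))
```

Smaller values of `epsilon` flatten the distribution toward uniform, trading utility for stronger privacy.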
Location Anomalies Detection for Connected and Autonomous Vehicles
Wang, Xiaoyang, Mavromatis, Ioannis, Tassi, Andrea, Santos-Rodriguez, Raul, Piechocki, Robert J.
Future Connected and Automated Vehicles (CAV), and more generally ITS, will form a highly interconnected system. Such a paradigm is referred to as the Internet of Vehicles (herein Internet of CAVs) and is a prerequisite to orchestrate traffic flows in cities. For optimal decision making and supervision, traffic centres will have access to suitably anonymized CAV mobility information. Safe and secure operations will then be contingent on early detection of anomalies. In this paper, a novel unsupervised learning model based on a deep autoencoder is proposed to detect self-reported location anomalies in CAVs, using vehicle locations and the Received Signal Strength Indicator (RSSI) as features. Quantitative experiments on simulation datasets show that the proposed approach is effective and robust in detecting self-reported location anomalies.
Radial Bayesian Neural Networks: Robust Variational Inference In Big Models
Farquhar, Sebastian, Osborne, Michael, Gal, Yarin
We propose Radial Bayesian Neural Networks: a variational distribution for mean field variational inference (MFVI) in Bayesian neural networks that is simple to implement, scalable to large models, and robust to hyperparameter selection. We hypothesize that standard MFVI fails in large models because of a property of the high-dimensional Gaussians used as posteriors. As variances grow, samples come almost entirely from a 'soap-bubble' far from the mean. We show that the ad-hoc tweaks used previously in the literature to get MFVI to work served to stop such variances growing. Designing a new posterior distribution, we avoid this pathology in a theoretically principled way. Our distribution improves accuracy and uncertainty over standard MFVI, while scaling to large data where most other VI and MCMC methods struggle. We benchmark Radial BNNs in a real-world task of diabetic retinopathy diagnosis from fundus images, a task with ~100x larger input dimensionality and model size compared to previous demonstrations of MFVI.
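The 'soap-bubble' property can be checked numerically with a short sketch: samples from an isotropic Gaussian $N(0, \sigma^2 I)$ in $d$ dimensions have norms that concentrate near $\sigma\sqrt{d}$, so in high dimensions almost no samples land near the mean. The code only illustrates this general property of Gaussians; it does not implement the Radial BNN posterior, and the function name and sample sizes are arbitrary choices.

```python
import math
import random


def sample_norms(dim, num_samples, sigma=1.0, seed=0):
    """Euclidean norms of samples drawn from N(0, sigma^2 I) in `dim` dimensions."""
    rng = random.Random(seed)
    norms = []
    for _ in range(num_samples):
        squared = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(dim))
        norms.append(math.sqrt(squared))
    return norms


if __name__ == "__main__":
    for dim in (1, 10, 1000):
        norms = sample_norms(dim, num_samples=200)
        radius = math.sqrt(dim)
        # Report the closest and the average sample distance from the mean,
        # both relative to sigma * sqrt(dim).
        print(dim, min(norms) / radius, sum(norms) / len(norms) / radius)
```

As `dim` grows, the smallest observed norm approaches the average one, which is the thin shell the abstract refers to; scaling `sigma` up moves that shell proportionally further from the mean.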
An Enhanced Electrocardiogram Biometric Authentication System Using Machine Learning
Alkeem, Ebrahim Al, Kim, Song-Kyoo, Yeun, Chan Yeob, Zemerly, M. Jamal, Poon, Kin, Yoo, Paul D.
Traditional authentication systems use alphanumeric or graphical passwords, or token-based techniques that require "something you know and something you have". The disadvantages of these systems include the risks of forgetfulness, loss, and theft. To address these shortcomings, biometric authentication is rapidly replacing traditional authentication methods and is becoming an everyday part of life. The electrocardiogram (ECG) is one of the most recent traits considered for biometric purposes, and three typical use cases have been described: security checks, hospitals and wearable devices. Here we describe an ECG-based authentication system suitable for security checks and hospital environments. The proposed authentication system will help investigators studying ECG-based biometric authentication techniques to define dataset boundaries and to acquire high-quality training data. We evaluated the performance of the proposed system using a confusion matrix and also by applying the Amang ECG (amgecg) toolbox in MATLAB to investigate two parameters that directly affect the accuracy of authentication: the ECG slicing time (sliding window) and sampling time. Using this approach, we found that accuracy was optimized by using a sliding window of 0.4 s and a sampling time of 37 s.
A Contactless Artificial Intelligence System for Smart Devices Can Identify a Sign of Cardiac Arrest
Researchers at the University of Washington created a tool, which could potentially be developed into an application for smart speakers and smartphones, that uses algorithms and machine learning to identify instances of agonal breathing, a sign of cardiac arrest, with an accuracy of 97% at distances of up to 6 meters away. A contactless support vector machine (SVM), an artificial intelligence system that uses algorithms and machine learning, could be used by smart speakers and similar devices to detect agonal breathing, a symptom of potential cardiac arrest. The machine performs with 97% accuracy from a distance of up to 6 meters away, according to a study in Nature Partner Journals Digital Medicine. "A lot of people have smart speakers in their homes, and these devices have amazing capabilities that we can take advantage of," said study co-author Shyam Gollakota, PhD, associate professor at the University of Washington's Paul G. Allen School of Computer Science and Engineering, in a statement. "We envision a contactless system that works by continuously and passively monitoring the bedroom for an agonal breathing event, and alerts anyone nearby to come provide CPR. And then if there's no response, the device can automatically call 911."
Learning fair predictors with Sensitive Subspace Robustness
Yurochkin, Mikhail, Bower, Amanda, Sun, Yuekai
As artificial intelligence (AI) systems permeate our world, the problem of implicit biases in these systems has become more serious. AI systems are routinely used to make decisions or support the decision-making process in credit, hiring, criminal justice, and education, all of which are domains protected by anti-discrimination law. Although AI systems appear to eliminate the biases of a human decision maker, they may perpetuate or even exacerbate biases in the training data [64]. Such biases are especially objectionable when they adversely affect underprivileged groups of users [3]. Although the most obvious remedy is to remove the biases in the training data, this is impractical in most applications.