
Collaborating Authors

 Adel, Tameem


Trustworthy Artificial Intelligence in the Context of Metrology

arXiv.org Artificial Intelligence

As background to the main story it is important to understand the meaning of artificial intelligence (AI), and more specifically how its subset machine learning (ML) fits into the picture. AI can be broadly defined as the theory and development of computer systems able to perform tasks that normally require human intelligence. As such, AI systems may be adept at discovering new information, making inferences, and reasoning. ML is the subset of AI concerned with methods that are able to learn and adapt. AI includes symbolic computation, such as expert systems, which are not a part of ML, whereas ML builds statistical models of data that may be used for classification and prediction tasks to aid decision-making. Here we focus on ML rather than AI, but will still use the term AI when referring to the more general technology.


Getting a CLUE: A Method for Explaining Uncertainty Estimates

arXiv.org Machine Learning

Both uncertainty estimation and interpretability are important factors for trustworthy machine learning systems. However, there is little work at the intersection of these two areas. We address this gap by proposing a novel method for interpreting uncertainty estimates from differentiable probabilistic models, like Bayesian Neural Networks (BNNs). Our method, Counterfactual Latent Uncertainty Explanations (CLUE), indicates how to change an input, while keeping it on the data manifold, such that a BNN becomes more confident about the input's prediction. We validate CLUE through 1) a novel framework for evaluating counterfactual explanations of uncertainty, 2) a series of ablation experiments, and 3) a user study. Our experiments show that CLUE outperforms baselines and enables practitioners to better understand which input patterns are responsible for predictive uncertainty.


Continual Learning with Adaptive Weights (CLAW)

arXiv.org Machine Learning

Approaches to continual learning aim to successfully learn a set of related tasks that arrive in an online manner. Recently, several frameworks have been developed which enable deep learning to be deployed in this learning scenario. A key modelling decision is to what extent the architecture should be shared across tasks. On the one hand, separately modelling each task avoids catastrophic forgetting, but it does not support transfer learning and leads to large models. On the other hand, rigidly specifying a shared component and a task-specific part enables task transfer and limits the model size, but it is vulnerable to catastrophic forgetting and restricts the form of task transfer that can occur. Ideally, the network should adaptively identify which parts of the network to share in a data-driven way. Here we introduce such an approach called Continual Learning with Adaptive Weights (CLAW), which is based on probabilistic modelling and variational inference. Experiments show that CLAW achieves state-of-the-art performance on six benchmarks in terms of overall continual learning performance, as measured by classification accuracy, and in terms of addressing catastrophic forgetting.
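A toy illustration of the shared-versus-task-specific trade-off (not CLAW's variational procedure; all data and names here are made up): if the second task only adapts a small task-specific parameter while the shared weights stay frozen, earlier-task behaviour is preserved by construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two related linear-regression tasks; task 2 is a rescaled version of task 1.
X1 = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y1 = X1 @ w_true
X2 = rng.normal(size=(100, 3))
y2 = X2 @ (2.0 * w_true)

# Shared weights are fit on task 1 only (closed-form least squares).
w_shared, *_ = np.linalg.lstsq(X1, y1, rcond=None)

# Task 2 learns only a scalar adaptation alpha of the shared predictor,
# leaving w_shared frozen -- task 1 behaviour cannot be forgotten.
z2 = X2 @ w_shared
alpha = (z2 @ y2) / (z2 @ z2)          # closed-form 1-D least squares

loss1 = np.mean((X1 @ w_shared - y1) ** 2)        # task 1, shared weights
loss2_before = np.mean((z2 - y2) ** 2)            # task 2, no adaptation
loss2_after = np.mean((alpha * z2 - y2) ** 2)     # task 2, adapted
```

Here the adaptation recovers the rescaling exactly, so task-2 error collapses while task-1 error is untouched; CLAW's contribution is to learn, per weight and per task, how much such adaptation is allowed.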


Conditional Learning of Fair Representations

arXiv.org Artificial Intelligence

We propose a novel algorithm for learning fair representations that can simultaneously mitigate two notions of disparity among different demographic subgroups. Two key components underpinning the design of our algorithm are balanced error rate and conditional alignment of representations. In settings that have historically had discrimination, we are interested in defining fairness with respect to a protected group, the group which has historically been disadvantaged. Among many recent attempts to achieve algorithmic fairness (Dwork et al., 2012; Hardt et al., 2016; Zemel et al., 2013; Zafar et al., 2015), learning fair representations has attracted increasing attention. However, it has long been empirically observed (Calders et al., 2009) and recently been proved (Zhao [...]). (Part of this work was done when Han Zhao was visiting the Vector Institute, Toronto.) In this work, we provide an affirmative answer to the above question by proposing an algorithm to align the conditional distributions (on the target variable) of representations across different demographic subgroups.
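The conditional-alignment component can be sketched numerically. The toy below (synthetic data; a mean-shift stand-in for the adversarial training objective used in the paper) aligns the group-conditional means of representations within each class, so the protected attribute is no longer recoverable from class-conditional first moments.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic representations with a group-dependent offset inside each class.
n = 200
y = rng.integers(0, 2, n)              # target label
a = rng.integers(0, 2, n)              # protected attribute
z = rng.normal(size=(n, 2)) + 3.0 * y[:, None] + 1.5 * a[:, None]

# Conditional alignment: within each class, shift every group's
# representations so the group-conditional means coincide with the class mean.
z_fair = z.copy()
for c in (0, 1):
    idx = y == c
    mean_c = z[idx].mean(axis=0)
    for g in (0, 1):
        sel = idx & (a == g)
        z_fair[sel] += mean_c - z[sel].mean(axis=0)

def mean_gap(r):
    # largest class-conditional gap between the two groups' means
    return max(np.linalg.norm(r[(y == c) & (a == 0)].mean(axis=0)
                              - r[(y == c) & (a == 1)].mean(axis=0))
               for c in (0, 1))
```

Before alignment the conditional gap reflects the injected group offset; after alignment it vanishes, while class structure (which drives utility) is preserved.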


Learning Bayesian Networks with Incomplete Data by Augmentation

AAAI Conferences

We present new algorithms for learning Bayesian networks from data with missing values using a data augmentation approach. An exact Bayesian network learning algorithm is obtained by recasting the problem into a standard Bayesian network learning problem without missing data. As expected, the exact algorithm does not scale to large domains. We build on the exact method to create an approximate algorithm using a hill-climbing technique. This algorithm scales to large domains so long as a suitable standard structure learning method for complete data is available. We perform a wide range of experiments to demonstrate the benefits of learning Bayesian networks with this new approach.
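The recasting step can be sketched in a few lines: each record with missing values is augmented into all of its completions, producing a complete dataset that a standard structure learner can score. (Variable layout and names below are illustrative only; the paper's algorithms additionally handle the weighting and scoring of completions.)

```python
from itertools import product

# A record over binary variables, with None marking a missing value.
def completions(record):
    missing = [i for i, v in enumerate(record) if v is None]
    out = []
    for fill in product([0, 1], repeat=len(missing)):
        r = list(record)
        for i, v in zip(missing, fill):
            r[i] = v
        out.append(tuple(r))
    return out

data = [(1, None, 0), (0, 1, None), (1, 1, 1)]
augmented = [c for rec in data for c in completions(rec)]
print(len(augmented))  # → 5 (two completions each for the first two records)
```

Because the number of completions grows exponentially in the number of missing entries per record, the exact recasting does not scale, which motivates the hill-climbing approximation described above.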


Unsupervised Domain Adaptation with a Relaxed Covariate Shift Assumption

AAAI Conferences

The training and test distributions can be different (Storkey and Sugiyama 2006; Ben-David and Urner 2012; 2014). The training and test domains are commonly referred to in the domain adaptation literature as the source and target domains, respectively. Domain diversity can emerge as a result of the scarcity of available labeled data from the target domain. It can as well be innate in the problem itself due to, for example, an ongoing change occurring to the source domain, as in cases where the original source domain keeps changing over time. Domain adaptation aims at finding solutions for this kind of problem, where the training (source) data are drawn from a distribution different from that of the test (target) data. Covariate shift is a valid assumption in some problems, but it can as well be quite unrealistic for many other domain adaptation tasks where the conditional label distributions are not (or, more precisely, not guaranteed to be) identical. The simplification resulting from assuming identical labeling distributions facilitates the quest for a tractable learning algorithm, albeit possibly at the cost of reducing the expressive power of the representation, and consequently the accuracy of the resulting hypothesis.


Learning Bayesian Networks with Incomplete Data by Augmentation

arXiv.org Artificial Intelligence

We present new algorithms for learning Bayesian networks from data with missing values using a data augmentation approach. An exact Bayesian network learning algorithm is obtained by recasting the problem into a standard Bayesian network learning problem without missing data. To the best of our knowledge, this is the first exact algorithm for this problem. As expected, the exact algorithm does not scale to large domains. We build on the exact method to create an approximate algorithm using a hill-climbing technique. This algorithm scales to large domains so long as a suitable standard structure learning method for complete data is available. We perform a wide range of experiments to demonstrate the benefits of learning Bayesian networks with this new approach.


Automatic Variational ABC

arXiv.org Machine Learning

Approximate Bayesian Computation (ABC) is a framework for performing likelihood-free posterior inference for simulation models. Stochastic Variational Inference (SVI) is an appealing alternative to the inefficient sampling approaches commonly used in ABC. However, SVI is highly sensitive to the variance of the gradient estimators, and this problem is exacerbated by approximating the likelihood. We draw upon recent advances in variance reduction for SVI and likelihood-free inference using deterministic simulations to produce low-variance gradient estimators of the variational lower bound. By then exploiting automatic differentiation libraries we can avoid nearly all model-specific derivations. We demonstrate performance on three problems and compare to existing SVI algorithms. Our results demonstrate the correctness and efficiency of our algorithm.
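The variance-reduction idea can be sketched on a toy problem (all settings assumed for illustration, not the paper's algorithm): freezing the simulator's noise makes the simulator a deterministic function of the parameter, so reparameterized gradients of the discrepancy between simulated and observed summaries have low variance.

```python
import numpy as np

rng = np.random.default_rng(3)

# Likelihood-free toy problem: infer the mean of a Gaussian simulator
# from one observed summary statistic.
x_obs = 2.0
eps_sim = rng.normal(size=64)          # frozen simulator noise (common random numbers)
eps_q = rng.normal(size=64)            # frozen variational noise

mu, log_sigma = 0.0, 0.0               # q(theta) = N(mu, exp(log_sigma)^2)
for _ in range(500):
    sigma = np.exp(log_sigma)
    theta = mu + sigma * eps_q                      # reparameterization trick
    d = theta[:, None] + eps_sim[None, :] - x_obs   # simulated minus observed
    g_theta = 2.0 * d.mean(axis=1)                  # grad of mean sq. discrepancy
    mu -= 0.05 * g_theta.mean()                     # chain rule: dtheta/dmu = 1
    log_sigma -= 0.05 * (g_theta * sigma * eps_q).mean()  # dtheta/dlog_sigma
```

The variational mean converges to the observed summary (up to the frozen noise), and the variational scale contracts; autodiff libraries let the same loop run for arbitrary differentiable simulators without hand-derived gradients.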


A Probabilistic Covariate Shift Assumption for Domain Adaptation

AAAI Conferences

The aim of domain adaptation algorithms is to establish a learner, trained on labeled data from a source domain, that can classify samples from a target domain, in which few or no labeled data are available for training. Covariate shift, a primary assumption in several works on domain adaptation, assumes that the labeling functions of source and target domains are identical. We present a domain adaptation algorithm that assumes a relaxed version of covariate shift where the assumption that the labeling functions of the source and target domains are identical holds with a certain probability. Assuming a source deterministic large margin binary classifier, the farther a target instance is from the source decision boundary, the higher the probability that covariate shift holds. In this context, given a target unlabeled sample and no target labeled data, we develop a domain adaptation algorithm that bases its labeling decisions both on the source learner and on the similarities between the target unlabeled instances. The source labeling function decisions associated with probabilistic covariate shift, along with the target similarities are concurrently expressed on a similarity graph. We evaluate our proposed algorithm on a benchmark sentiment analysis (and domain adaptation) dataset, where state-of-the-art adaptation results are achieved. We also derive a lower bound on the performance of the algorithm.
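A small sketch of the labeling rule (the trust curve, threshold, and data below are illustrative assumptions, not the paper's construction): target points far from the source large-margin boundary keep the source label, while near-boundary points inherit the label of their nearest confidently-labeled neighbour, a crude stand-in for the similarity graph.

```python
import numpy as np

rng = np.random.default_rng(4)

w = np.array([1.0, 1.0])                       # source linear classifier
Xt = rng.normal(size=(200, 2)) * 2.0           # unlabeled target sample

margin = np.abs(Xt @ w) / np.linalg.norm(w)    # distance to source boundary
p_trust = 1.0 - np.exp(-margin)                # assumed monotone trust curve
source_label = np.sign(Xt @ w)

confident = p_trust > 0.8                      # probabilistic covariate shift holds
labels = np.where(confident, source_label, 0.0)

conf_idx = np.where(confident)[0]
for i in np.where(~confident)[0]:
    d = np.linalg.norm(Xt[conf_idx] - Xt[i], axis=1)
    labels[i] = labels[conf_idx[np.argmin(d)]]  # nearest confident neighbour
```

Every target point ends up labeled, with source decisions dominating far from the boundary and target-side similarity dominating near it, which is the division of labour the probabilistic relaxation is designed to express.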


Generative Multiple-Instance Learning Models For Quantitative Electromyography

arXiv.org Machine Learning

We present a comprehensive study of the use of generative modeling approaches for Multiple-Instance Learning (MIL) problems. In MIL a learner receives training instances grouped together into bags with labels for the bags only (which might not be correct for the comprised instances). Our work was motivated by the task of facilitating the diagnosis of neuromuscular disorders using sets of motor unit potential trains (MUPTs) detected within a muscle which can be cast as a MIL problem. Our approach leads to a state-of-the-art solution to the problem of muscle classification. By introducing and analyzing generative models for MIL in a general framework and examining a variety of model structures and components, our work also serves as a methodological guide to modelling MIL tasks. We evaluate our proposed methods both on MUPT datasets and on the MUSK1 dataset, one of the most widely used benchmarks for MIL.
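A minimal generative-MIL sketch (parameters and bags below are made up for illustration): instances come from class-conditional Gaussians, and instance-level posteriors are combined into a bag-level score with the standard MIL rule that a bag is positive if at least one instance is.

```python
import numpy as np

# Class-conditional instance models: N(-1, 1) negative, N(2, 1) positive,
# equal priors and unit variances (all assumed values).
mu_neg, mu_pos = -1.0, 2.0

def bag_log_odds(bag):
    # posterior that each instance comes from the positive component
    p = 1.0 / (1.0 + np.exp(-(mu_pos - mu_neg) * (bag - (mu_pos + mu_neg) / 2)))
    # noisy-OR combination: the bag is positive unless every instance is negative
    p_bag = 1.0 - np.prod(1.0 - p)
    return np.log(p_bag / (1.0 - p_bag + 1e-12))

neg_bag = np.array([-1.2, -0.8, -1.0])   # all instances near the negative mode
pos_bag = np.array([-1.0, -0.9, 2.1])    # one instance near the positive mode
print(bag_log_odds(neg_bag), bag_log_odds(pos_bag))
```

A single positive-looking instance is enough to flip the bag score, mirroring the MIL setting where bag labels need not hold for every comprised instance; the paper's models generalize this by learning the component densities and combination structure from data.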