
Collaborating Authors

 Picard, Rosalind W.


DISSECT: Disentangled Simultaneous Explanations via Concept Traversals

arXiv.org Artificial Intelligence

Explaining deep learning model inferences is a promising avenue for scientific understanding, improving safety, uncovering hidden biases, evaluating fairness, and beyond, as argued by many scholars. One of the principal benefits of counterfactual explanations is allowing users to explore "what-if" scenarios through what does not and cannot exist in the data, a quality that many other forms of explanation, such as heatmaps and influence functions, inherently lack. However, most previous work on generative explainability cannot disentangle important concepts effectively, produces unrealistic examples, or fails to retain relevant information. We propose a novel approach, DISSECT, that jointly trains a generator, a discriminator, and a concept disentangler to overcome such challenges using little supervision. DISSECT generates Concept Traversals (CTs), defined as a sequence of generated examples with increasing degrees of concepts that influence a classifier's decision. By training a generative model from a classifier's signal, DISSECT offers a way to discover a classifier's inherent "notion" of distinct concepts automatically, rather than relying on user-predefined concepts. We show that DISSECT produces CTs that (1) disentangle several concepts, (2) are influential to a classifier's decision and are coupled to its reasoning due to joint training, (3) are realistic, (4) preserve relevant information, and (5) are stable across similar inputs. We validate DISSECT on several challenging synthetic and realistic datasets where previous methods fall short of satisfying desirable criteria for interpretability, and show that it performs consistently well and better than existing methods.
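
A minimal sketch of what producing one Concept Traversal might look like once a DISSECT-style generator has been trained. The generator interface G(x, concept_idx, alpha) and classifier f used here are placeholder assumptions for illustration, not the paper's actual API.

import torch

@torch.no_grad()
def concept_traversal(G, f, x, concept_idx, steps=8):
    """Sweep one concept's strength and record how the classifier's score moves."""
    traversal, scores = [], []
    for alpha in torch.linspace(0.0, 1.0, steps):
        x_gen = G(x, concept_idx, alpha.item())   # generated example at this concept strength
        p = torch.softmax(f(x_gen), dim=-1)       # classifier probabilities for the edited input
        traversal.append(x_gen)
        scores.append(p)
    return torch.stack(traversal), torch.stack(scores)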


Characterizing Sources of Uncertainty to Proxy Calibration and Disambiguate Annotator and Data Bias

arXiv.org Machine Learning

Supporting model interpretability for complex phenomena where annotators can legitimately disagree, such as emotion recognition, is a challenging machine learning task. In this work, we show that explicitly quantifying the uncertainty in such settings has interpretability benefits. We use a simple modification of classical network inference, based on Monte Carlo dropout, to obtain measures of epistemic and aleatoric uncertainty. We identify a significant correlation between aleatoric uncertainty and human annotator disagreement ($r\approx.3$). Additionally, we demonstrate how difficult and subjective training samples can be identified using aleatoric uncertainty and how epistemic uncertainty can reveal data bias that could result in unfair predictions. We identify total uncertainty as a suitable surrogate for model calibration, i.e., the degree to which we can trust the model's predicted confidence. In addition to explainability benefits, we observe modest performance boosts from incorporating model uncertainty.
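
For reference, a common way to compute an epistemic/aleatoric split with Monte Carlo dropout is sketched below. The entropy-based decomposition and the number of stochastic passes T are standard choices, not necessarily the paper's exact formulation; `model` is any torch module containing Dropout layers.

import torch

def mc_dropout_uncertainty(model, x, T=30, eps=1e-12):
    """Run T stochastic forward passes and decompose predictive uncertainty."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(T)])  # (T, B, C)
    mean_p = probs.mean(dim=0)
    total = -(mean_p * (mean_p + eps).log()).sum(dim=-1)            # predictive entropy
    aleatoric = -(probs * (probs + eps).log()).sum(dim=-1).mean(0)  # expected per-pass entropy
    epistemic = total - aleatoric                                   # mutual information
    return total, aleatoric, epistemic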


Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach

arXiv.org Artificial Intelligence

Human behavior expression and experience are inherently multi-modal and characterized by vast individual and contextual heterogeneity. To achieve meaningful human-computer and human-robot interactions, multi-modal models of the user's states (e.g., engagement) are therefore needed. Most of the existing works that try to build classifiers for the user's states assume that the data used to train the models are fully labeled. Nevertheless, data labeling is costly and tedious, and also prone to subjective interpretations by the human coders. This is even more pronounced when the data are multi-modal (e.g., some users are more expressive with their facial expressions, some with their voice). Thus, building models that can accurately estimate the user's states during an interaction is challenging. To tackle this, we propose a novel multi-modal active learning (AL) approach that uses the notion of deep reinforcement learning (RL) to find an optimal policy for active selection of the user's data, needed to train the target (modality-specific) models. We investigate different strategies for multi-modal data fusion and show that the proposed model-level fusion coupled with RL outperforms the feature-level and modality-specific models, naive AL strategies such as random sampling, and standard heuristics such as uncertainty sampling. We show the benefits of this approach on the task of engagement estimation from real-world child-robot interactions during autism therapy. Importantly, we show that the proposed multi-modal AL approach can be used to efficiently personalize the engagement classifiers to the target user using a small amount of actively selected user data.
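
A rough sketch of the active-selection loop described above: a learned policy decides, per incoming sample, whether to request a costly label. The policy network, the state features, and the validation-gain reward are illustrative assumptions rather than the paper's design, and the policy-gradient update itself is elided.

import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))  # actions: 0 = skip, 1 = query

def select_and_train(stream, state_fn, label_fn, update_fn, eval_fn):
    """One pass over an unlabeled stream of multi-modal samples."""
    prev_acc = eval_fn()
    for sample in stream:
        state = state_fn(sample)                  # e.g. fused per-modality uncertainty features
        action = torch.distributions.Categorical(logits=policy(state)).sample().item()
        if action == 1:                           # ask the human coder for a label
            x, y = sample, label_fn(sample)
            update_fn(x, y)                       # update the target (engagement) model
            acc = eval_fn()
            reward = acc - prev_acc               # reward = change in validation accuracy
            prev_acc = acc
            # ...a REINFORCE or Q-learning update of `policy` using `reward` would go here...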


Meta-Weighted Gaussian Process Experts for Personalized Forecasting of AD Cognitive Changes

arXiv.org Machine Learning

We introduce a novel personalized Gaussian Process Experts (pGPE) model for predicting per-subject ADAS-Cog13 cognitive scores -- a significant predictor of Alzheimer's Disease (AD) in the cognitive domain -- over the future 6, 12, 18, and 24 months. We start by training a population-level model using multi-modal data from previously seen subjects with a base Gaussian Process (GP) regression. Then, we personalize this model by adapting the base GP sequentially over time to a new (target) subject using domain adaptive GPs, and also by training a subject-specific GP. While we show that these models achieve improved performance when selectively applied to the forecasting task (one performs better than the other on different subjects/visits), the average performance per model is suboptimal. To this end, we use the notion of meta-learning in the proposed pGPE to design a regression-based weighting of these expert models, where the expert weights are optimized for each subject and his/her future visit. The results on a cohort of subjects from the ADNI dataset show that this newly introduced personalized weighting of the expert models leads to large improvements in accurately forecasting future ADAS-Cog13 scores and their fine-grained changes associated with AD progression. This approach has the potential to help identify at-risk patients early and improve the construction of clinical trials for AD.
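
One way a regression-based weighting of expert forecasts could look is sketched below. The ridge regressors, the meta-features, and the soft error-based target weights are assumptions made for illustration, not the pGPE specifics.

import numpy as np
from sklearn.linear_model import Ridge

def fit_weight_regressors(meta_features, expert_preds, targets):
    """Learn, per expert, a mapping from meta-features to that expert's weight."""
    # expert_preds: (n_samples, n_experts); soft targets favor the more accurate expert
    errors = np.abs(expert_preds - targets[:, None])
    soft_weights = np.exp(-errors) / np.exp(-errors).sum(axis=1, keepdims=True)
    return [Ridge(alpha=1.0).fit(meta_features, soft_weights[:, k])
            for k in range(expert_preds.shape[1])]

def weighted_forecast(regressors, meta_feature, expert_preds):
    """Combine the experts' predictions for one subject/visit."""
    w = np.array([max(r.predict(meta_feature[None])[0], 0.0) for r in regressors])
    w = w / w.sum() if w.sum() > 0 else np.full_like(w, 1.0 / len(w))
    return float(w @ expert_preds)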


Personalized Gaussian Processes for Future Prediction of Alzheimer's Disease Progression

arXiv.org Machine Learning

In this paper, we introduce the use of a personalized Gaussian Process model (pGP) to predict the key metrics of Alzheimer's Disease progression (MMSE, ADAS-Cog13, CDRSB and CS) based on each patient's previous visits. We start by learning a population-level model using multi-modal data from previously seen patients using the base Gaussian Process (GP) regression. Then, this model is adapted sequentially over time to a new patient using domain adaptive GPs to form the patient's pGP. We show that this new approach, together with an auto-regressive formulation, leads to significant improvements in forecasting future clinical status and cognitive scores for target patients when compared to modeling the population with traditional GPs.
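
A minimal sketch of the sequential-personalization idea: the population model supplies the prior mean, and each new visit from the target patient conditions the GP further. The RBF kernel, noise level, and interfaces are illustrative assumptions, not the paper's domain-adaptive GP formulation.

import numpy as np

def rbf(A, B, ls=1.0, var=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / ls**2)

class PersonalizedGP:
    def __init__(self, prior_mean_fn, noise=0.1):
        self.prior_mean_fn = prior_mean_fn          # e.g. the population GP's predictive mean
        self.noise = noise
        self.X, self.y = np.empty((0, 1)), np.empty(0)

    def add_visit(self, x, y):
        """Condition the patient's GP on one more observed visit (x: shape (1,) input)."""
        self.X = np.vstack([self.X, x[None]])
        self.y = np.append(self.y, y)

    def predict(self, Xs):
        """Predict at future visit times Xs, falling back to the population prior if no visits yet."""
        if len(self.y) == 0:
            return self.prior_mean_fn(Xs)
        K = rbf(self.X, self.X) + self.noise * np.eye(len(self.y))
        Ks = rbf(Xs, self.X)
        resid = self.y - self.prior_mean_fn(self.X)
        return self.prior_mean_fn(Xs) + Ks @ np.linalg.solve(K, resid)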


Invited Talk Abstracts

AAAI Conferences

Thomas K. Landauer (Pearson Knowledge Technologies): The recently created word maturity (WM) metric uses the computational language model LSA to mimic the average evolutionary growth of individual word and paragraph knowledge as a function of the total amount and order of simulated reading. The simulator traces the separate growth trajectories of an unlimited number of different words from the beginning of reading to adult level.


Hyperparameter and Kernel Learning for Graph Based Semi-Supervised Classification

Neural Information Processing Systems

There have been many graph-based approaches for semi-supervised classification. One problem is that of hyperparameter learning: performance depends greatly on the hyperparameters of the similarity graph, the transformation of the graph Laplacian, and the noise model. We present a Bayesian framework for learning hyperparameters for graph-based semi-supervised classification. Given some labeled data, which can contain inaccurate labels, we pose the semi-supervised classification as an inference problem over the unknown labels. Expectation Propagation is used for approximate inference and the mean of the posterior is used for classification. The hyperparameters are learned using EM for evidence maximization. We also show that the posterior mean can be written in terms of the kernel matrix, providing a Bayesian classifier to classify new points. Tests on synthetic and real datasets show cases where there are significant improvements in performance over the existing approaches.
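
For context, the classical harmonic-function solution on a similarity graph, which this Bayesian framework generalizes, is sketched below. The RBF bandwidth sigma is exactly the kind of graph hyperparameter the paper learns; the EP approximation and EM hyperparameter learning themselves are not shown.

import numpy as np

def harmonic_label_inference(X, y_labeled, labeled_idx, sigma=1.0):
    """Infer soft labels for unlabeled nodes of an RBF similarity graph."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma**2))                # similarity graph weights
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W                       # graph Laplacian
    u = np.setdiff1d(np.arange(n), labeled_idx)     # unlabeled node indices
    # Harmonic solution: f_u = -L_uu^{-1} L_ul f_l
    f_u = -np.linalg.solve(L[np.ix_(u, u)], L[np.ix_(u, labeled_idx)] @ y_labeled)
    return u, f_u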



Response to Sloman's Review of Affective Computing

AI Magazine

I am grateful to Aaron Sloman for his review. Sloman was one of the first in the AI community to write about the role of emotion in computing (Sloman and Croucher 1981), and I value his insight into theories of emotional and intelligent systems. His review, however, dwells largely on some details related to unknown features of human emotion; hence, I don't think it captures the flavor of the book, and it might seem confusing in places whether or not you've read the book. He does raise interesting points, as well as potential misunderstandings, both of which I am grateful for the opportunity to comment on. I use the expression "emotion recognition" only when established as shorthand for the unwieldy but more accurate description "inference of an emotional state from observations" (chapter 4). The computer cannot directly read internal thoughts or feelings, and therefore there is no "emotion detector"; it can only detect certain expressions that arise in conjunction with an internal state, such as pressure profiles of banging on a mouse or certain video signals. Affective cues are a natural way that humans give feedback to learning systems, and my students and I currently use tools of expression recognition to gather data to hone the abilities of our research systems, always with the consent of those involved. In contexts where humans interact with computers naturally and socially (Reeves and Nass 1996), a computer might speed up if we seem bored or offer an alternate explanation if we appear confused. Nontechnical users are in the majority, and they tend not to understand the limits of the technology; they are already so amazed at what the computer does that they ask, "Does it know that I don't like it?" At one time I would have discounted such remarks, but their feelings and fears are a wake-up call to us: current forms of computer-mediated interaction limit affective communication. Although inappropriate use of affect, even relatively benign intrusions such as emotional agents that jiggle about on the screen, smiling at you in an annoying and inappropriate fashion and costing you precious time, might be the most common affront with this technology, there are also potentially more serious problems, especially given an incomplete understanding of the phenomena.