 University of Wisconsin-Madison


Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners

AAAI Conferences

We investigate a problem at the intersection of machine learning and security: training-set attacks on machine learners. In such attacks, an attacker contaminates the training data so that a specific learning algorithm produces a model profitable to the attacker. Understanding training-set attacks is important as more intelligent agents (e.g., spam filters and robots) are equipped with learning capability and can potentially be hacked via data they receive from the environment. This paper identifies the optimal training-set attack on a broad family of machine learners. First, we show that the optimal training-set attack can be formulated as a bilevel optimization problem. Then we show that for machine learners with certain Karush-Kuhn-Tucker conditions we can solve the bilevel problem efficiently using gradient methods on an implicit function. As examples, we demonstrate optimal training-set attacks on Support Vector Machines, logistic regression, and linear regression with extensive experiments. Finally, we discuss potential defenses against such attacks.
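
As a rough illustration of the bilevel idea in the abstract above, the following sketch poisons only the labels of a ridge regression problem. It is not the paper's method: ridge regression is used because its solution is available in closed form, so the "implicit function" from training data to model is explicit here. The attacker objective, the parameters `lam` and `mu`, and the function names are all assumptions made for this example.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Learner (lower level): closed-form ridge regression solution."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def attack_labels(X, y0, theta_target, lam=0.1, mu=1e-3, steps=500):
    """Attacker (upper level): gradient descent over poisoned labels y.

    Assumed attacker objective, chosen only for illustration:
        0.5 * ||theta(y) - theta_target||^2 + 0.5 * mu * ||y - y0||^2
    where theta(y) = A @ y is the learner's solution as a function of y.
    """
    d = X.shape[1]
    # A maps labels to the learned model: theta(y) = A @ y.
    A = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)
    # The objective is quadratic, so 1/L with L = ||A||_2^2 + mu is a safe step size.
    L = np.linalg.norm(A, 2) ** 2 + mu
    y = y0.astype(float).copy()
    for _ in range(steps):
        # Chain rule through the learner: d theta / d y = A.
        grad = A.T @ (A @ y - theta_target) + mu * (y - y0)
        y -= grad / L
    return y
```

Refitting with `ridge_fit(X, attack_labels(X, y0, theta_target), 0.1)` should give a model closer to `theta_target` than the clean fit, with `mu` controlling how far the labels may drift from `y0`.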


Learning to Reject Sequential Importance Steps for Continuous-Time Bayesian Networks

AAAI Conferences

Applications of graphical models often require the use of approximate inference, such as sequential importance sampling (SIS), for estimation of the model distribution given partial evidence, i.e., the target distribution. However, when SIS proposal and target distributions are dissimilar, such procedures lead to biased estimates or require a prohibitive number of samples. We introduce ReBaSIS, a method that better approximates the target distribution by sampling variable by variable from existing importance samplers and accepting or rejecting each proposed assignment in the sequence: a choice made based on anticipating upcoming evidence. We relate the per-variable proposal and model distributions by expected weight ratios of sequence completions and show that we can learn accurate models of optimal acceptance probabilities from local samples. In a continuous-time domain, our method improves upon previous importance samplers by transforming an SIS problem into a machine learning one.
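
The difficulty described above, that dissimilar proposal and target distributions degrade importance sampling, can be seen in a one-dimensional toy problem. The sketch below is not ReBaSIS and involves no continuous-time Bayesian network; it assumes unit-variance Gaussian target and proposal distributions and reports the effective sample size, a standard diagnostic. All names are hypothetical.

```python
import numpy as np

def normal_logpdf(x, mean, std):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

def importance_estimate(f, target_mean, proposal_mean, n, rng):
    """Self-normalized importance sampling estimate of E_target[f(x)].

    Returns the estimate and the effective sample size 1 / sum(w_i^2),
    where w_i are the normalized importance weights.
    """
    x = rng.normal(proposal_mean, 1.0, size=n)
    log_w = normal_logpdf(x, target_mean, 1.0) - normal_logpdf(x, proposal_mean, 1.0)
    # Subtract the max before exponentiating for numerical stability.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)
    return float(np.sum(w * f(x))), float(ess)
```

With a proposal mean near the target mean the effective sample size stays close to `n`; moving the proposal several standard deviations away collapses it to a handful of samples, which is the regime the abstract refers to.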


Modeling Human-Robot Interactions as Systems of Distributed Cognition

AAAI Conferences

Robots that are integrated into day-to-day settings as assistants, collaborators, and companions will engage in dynamic, physically-situated social interactions with their users. Enabling such interactions will require appropriate models and representations for interaction. In this paper, we argue that the dynamic, physically-situated interactions between humans and robots can be characterized as a system of distributed cognition, that this system can be represented using probabilistic graphical models (PGMs), and that the parameters of these models can be learned from human interactions. We illustrate the application of this perspective in our ongoing research on modeling dyadic referential communication.


Relational One-Class Classification: A Non-Parametric Approach

AAAI Conferences

One-class classification approaches have been proposed in the literature to learn classifiers from examples of only one class. But these approaches are not directly applicable to relational domains due to their reliance on a feature vector or a distance measure. We propose a non-parametric relational one-class classification approach based on first-order trees. We learn a tree-based distance measure that iteratively introduces new relational features to differentiate relational examples. We update the distance measure so as to maximize the one-class classification performance of our model. We also relate our model definition to existing work on probabilistic combination functions and density estimation. We experimentally show that our approach can discover relevant features and outperform three baseline approaches.


A Spatially Sensitive Kernel to Predict Cognitive Performance from Short-Term Changes in Neural Structure

AAAI Conferences

This paper introduces a novel framework for performing machine learning on longitudinal neuroimaging datasets. These datasets are characterized by their size, particularly their width (millions of features per data input). Specifically, we address the problem of detecting subtle, short-term changes in neural structure that are indicative of cognitive change and correlate with risk factors for Alzheimer's disease. We introduce a new spatially-sensitive kernel that allows us to reason about individuals, as opposed to populations. In doing so, this paper presents the first evidence demonstrating that very small changes in white matter structure over a two year period can predict change in cognitive function in healthy adults.


Persistent Homology: An Introduction and a New Text Representation for Natural Language Processing

AAAI Conferences

Persistent homology is a mathematical tool from topological data analysis. It performs multi-scale analysis on a set of points and identifies clusters, holes, and voids therein. These latter topological structures complement standard feature representations, making persistent homology an attractive feature extractor for artificial intelligence. Research on persistent homology for AI is in its infancy, and is currently hindered by two issues: the lack of an accessible introduction for AI researchers, and the paucity of applications. In response, the first part of this paper presents a tutorial on persistent homology specifically aimed at a broader audience without sacrificing mathematical rigor. The second part contains one of the first applications of persistent homology to natural language processing. Specifically, our Similarity Filtration with Time Skeleton (SIFTS) algorithm identifies holes that can be interpreted as semantic "tie-backs" in a text document, providing a new document structure representation. We illustrate our algorithm on documents ranging from nursery rhymes to novels, and on a corpus of child and adolescent writings.
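
To make the multi-scale analysis above concrete, the sketch below computes only the simplest case: the scales at which clusters (connected components, dimension 0) merge as a distance threshold grows. It does not detect holes or voids and is not the SIFTS algorithm. It assumes Euclidean points and uses the pairwise distance itself as the scale, which is one common convention; the function name is hypothetical.

```python
import numpy as np

def h0_persistence(points):
    """Death scales of connected components as the distance threshold grows.

    Every point starts as its own component at scale 0. A component dies
    when an edge first joins it to another component. One component never
    dies, so n points yield n - 1 finite death scales, in increasing order.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Process candidate edges from shortest to longest.
    edges = sorted((float(dist[i, j]), i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # merging two components kills one of them
            deaths.append(d)
    return deaths
```

A large gap between consecutive death scales suggests well-separated clusters: components that survive over a long range of scales are the "persistent" features.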


Learning When to Reject an Importance Sample

AAAI Conferences

When observations are incomplete or data are missing, approximate inference methods based on importance sampling are often used. Unfortunately, when the target and proposal distributions are dissimilar, the sampling procedure leads to biased estimates or requires a prohibitive number of samples. Our method approximates a multivariate target distribution by sampling from an existing, sequential importance sampler and accepting or rejecting the proposals. We develop the rejection-sampler framework and show we can learn the acceptance probabilities from local samples. In a continuous-time domain, we show our method improves upon previous importance samplers by transforming a sequential importance sampling problem into a machine learning one.


Machine Learning for Personalized Medicine: Predicting Primary Myocardial Infarction from Electronic Health Records

AI Magazine

Electronic health records (EHRs) are an emerging relational domain with large potential to improve clinical outcomes. We apply two statistical relational learning (SRL) algorithms to the task of predicting primary myocardial infarction. We show that one SRL algorithm, relational functional gradient boosting, outperforms propositional learners, particularly in the medically relevant high-recall region. We observe that both SRL algorithms predict outcomes better than their propositional analogs and suggest how our methods can augment current epidemiological practices.


Identifying Adverse Drug Events by Relational Learning

AAAI Conferences

The pharmaceutical industry, consumer protection groups, users of medications, and government oversight agencies are all strongly interested in identifying adverse reactions to drugs. While a clinical trial of a drug may use only a thousand patients, once a drug is released on the market it may be taken by millions of patients. As a result, in many cases adverse drug events (ADEs) are observed in the broader population that were not identified during clinical trials. Therefore, there is a need for continued, postmarketing surveillance of drugs to identify previously unanticipated ADEs. This paper casts this problem as a reverse machine learning task related to relational subgroup discovery, and provides an initial evaluation of this approach based on experiments with an actual EMR/EHR and known adverse drug events.