Massachusetts Institute of Technology


Personalized Medication Dosing Using Volatile Data Streams

AAAI Conferences

One area of medicine that could benefit from personalized procedures is medication dosing. Mis-dosing medications may incur additional morbidity, or unnecessarily increase the length of patient stay. Here we illustrate a novel approach to personalized medication dosing that is robust to missing data, a common problem in the clinical care setting. We perform dose estimation using a novel take on multinomial logistic regression where model parameters are continuously estimated, for each patient, using a weighted combination of the data from a population of other patients, and a volatile data stream available from the individual under treatment. We evaluate our approach on 4,470 patients who received anti-coagulation therapy during intensive care treatment. Our approach was 29% more accurate than intensive care staff, and better able to distinguish outcomes than a non-personalized baseline (0.11 improvement in model VUS, a multiclass version of AUC). The advantages of our approach are its ease of interpretation, robustness to missing features, and extensibility to other problems with similar structure.
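The weighted combination of population data and the individual's data stream can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single mixing weight `alpha` (a hypothetical name) that scales the patient's own samples against the population's in a plain gradient-descent fit of a multinomial logistic regression. The feature layout and dose-class labels are placeholders.

```python
import math


def softmax(scores):
    # Subtract the maximum score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    # weights holds one row of coefficients per dose class.
    return softmax([sum(w_j * x_j for w_j, x_j in zip(row, x)) for row in weights])


def fit_weighted(samples, n_classes, n_features, lr=0.5, epochs=200):
    # samples: list of (features, class_label, sample_weight).
    weights = [[0.0] * n_features for _ in range(n_classes)]
    total_w = sum(w for _, _, w in samples)
    for _ in range(epochs):
        grad = [[0.0] * n_features for _ in range(n_classes)]
        for x, y, w in samples:
            p = predict(weights, x)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == y else 0.0)
                for j in range(n_features):
                    grad[k][j] += w * err * x[j]
        for k in range(n_classes):
            for j in range(n_features):
                weights[k][j] -= lr * grad[k][j] / total_w
    return weights


def personalized_fit(population, patient, alpha, n_classes, n_features):
    # Assumption: alpha in [0, 1] is the weight given to the patient's own
    # samples; population samples receive weight 1 - alpha.
    pooled = [(x, y, 1.0 - alpha) for x, y in population]
    pooled += [(x, y, alpha) for x, y in patient]
    return fit_weighted(pooled, n_classes, n_features)


# Toy data: features are [bias, one lab value]; labels are hypothetical dose classes.
population = [([1.0, 1.0], 0), ([1.0, -1.0], 1)]
patient = [([1.0, 1.0], 1)]
w = personalized_fit(population, patient, alpha=0.9, n_classes=3, n_features=2)
print(predict(w, [1.0, 1.0]))
```

With `alpha` near 0 the fit follows the population; as `alpha` grows, the patient's own observations dominate. Re-running the fit as new observations arrive is one simple way to mimic continuous re-estimation.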


Gated Orthogonal Recurrent Units: On Learning to Forget

AAAI Conferences

We present a novel recurrent neural network (RNN) based model that combines the remembering ability of unitary RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in their memory. We achieve this by extending unitary RNNs with a gating mechanism. Our model outperforms LSTMs, GRUs and unitary RNNs on several long-term dependency benchmark tasks. We show empirically both that orthogonal/unitary RNNs lack the ability to forget and that GORU can simultaneously remember long-term dependencies while forgetting irrelevant information, an ability that plays an important role in recurrent neural networks. We provide competitive results, along with an analysis of our model, on many natural sequential tasks including bAbI Question Answering, TIMIT speech spectrum prediction, and Penn TreeBank, and on synthetic tasks that involve long-term dependencies such as algorithmic, parenthesis, denoising and copying tasks.


RADMAX: Risk and Deadline Aware Planning for Maximum Utility

AAAI Conferences

Current network approaches aim to maximize network utilization when routing flows. While such approaches are fast and usually result in acceptable behavior, existing methods are not mission aware. There is no concept of utility maximization, no capability to handle flows with specified deadlines and loss requirements, and no guarantees over the probability of network saturation. In the presence of network degradation due to attacks, there is no guarantee that important flows will be properly transported. In this paper, we present RADMAX, a system for Risk And Deadline Aware Planning for Maximum Utility based on constraint programming, which allows us to handle higher level mission specifications. We show the correctness of RADMAX with respect to loss and delay bounds, provide results for the optimality of RADMAX with respect to the mission utility, and review current results on computational performance.


Decentralized High-Dimensional Bayesian Optimization With Factor Graphs

AAAI Conferences

This paper presents a novel decentralized high-dimensional Bayesian optimization (DEC-HBO) algorithm that, in contrast to existing HBO algorithms, can exploit the interdependent effects of various input components on the output of the unknown objective function f for boosting the BO performance and still preserve scalability in the number of input dimensions without requiring prior knowledge or the existence of a low (effective) dimension of the input space. To realize this, we propose a sparse yet rich factor graph representation of f to be exploited for designing an acquisition function that can be similarly represented by a sparse factor graph and hence be efficiently optimized in a decentralized manner using distributed message passing. Despite richly characterizing the interdependent effects of the input components on the output of f with a factor graph, DEC-HBO can still guarantee no-regret performance asymptotically. Empirical evaluation on synthetic and real-world experiments (e.g., sparse Gaussian process model with 1811 hyperparameters) shows that DEC-HBO outperforms the state-of-the-art HBO algorithms.


Model AI Assignments 2018

AAAI Conferences

The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. Recognizing that assignments form the core of student learning experience, we here present abstracts of seven AI assignments from the 2018 session that are easily adoptable, playfully engaging, and flexible for a variety of instructor needs.


Fact Checking in Community Forums

AAAI Conferences

Community Question Answering (cQA) forums are very popular nowadays, as they represent effective means for communities around particular topics to share information. Unfortunately, this information is not always factual. Thus, here we explore a new dimension in the context of cQA, which has been ignored so far: checking the veracity of answers to particular questions in cQA forums. As this is a new problem, we create a specialized dataset for it. We further propose a novel multi-faceted model, which captures information from the answer content (what is said and how), from the author profile (who says it), from the rest of the community forum (where it is said), and from external authoritative sources of information (external support). Evaluation results show a MAP value of 86.54, which is 21 points absolute above the baseline.
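For readers unfamiliar with the reported metric, the sketch below shows how mean average precision (MAP) is commonly computed. It assumes each question comes with its answers already ranked by a model's score and labeled as factual or not; the function names and toy data are illustrative and not taken from the paper. The result here is on a 0 to 1 scale, whereas the figure above appears to be on a 0 to 100 scale.

```python
def average_precision(ranked_labels):
    # ranked_labels: booleans in ranked order, True where the item is relevant
    # (here: the answer is factual).
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    # rankings: one ranked label list per question.
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two toy questions with three and two ranked answers respectively.
rankings = [[True, False, True], [False, True]]
print(mean_average_precision(rankings))
```

Average precision rewards rankings that place factual answers near the top; MAP averages that score over all questions.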


A Voting-Based System for Ethical Decision Making

AAAI Conferences

We present a general approach to automating ethical decisions, drawing on machine learning and computational social choice. In a nutshell, we propose to learn a model of societal preferences, and, when faced with a specific ethical dilemma at runtime, efficiently aggregate those preferences to identify a desirable choice. We provide a concrete algorithm that instantiates our approach; some of its crucial steps are informed by a new theory of swap-dominance efficient voting rules. Finally, we implement and evaluate a system for ethical decision making in the autonomous vehicle domain, using preference data collected from 1.3 million people through the Moral Machine website.
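The aggregation step can be illustrated with a familiar voting rule. The sketch below uses Borda count purely as an example; the abstract does not say which rule the system applies, and the alternative names and the alphabetical tie-break are assumptions made for illustration.

```python
from collections import defaultdict


def borda_scores(rankings):
    # rankings: one list per voter, ordered from most to least preferred.
    # With m alternatives, the top choice earns m - 1 points and the last earns 0.
    scores = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] += m - 1 - position
    return dict(scores)


def borda_winner(rankings):
    scores = borda_scores(rankings)
    # Assumption: ties are broken alphabetically, since max() keeps the first
    # maximal element of the sorted alternatives.
    return max(sorted(scores), key=lambda a: scores[a])


# Three hypothetical voters ranking three hypothetical outcomes of a dilemma.
voters = [
    ["swerve", "brake", "continue"],
    ["brake", "swerve", "continue"],
    ["brake", "continue", "swerve"],
]
print(borda_winner(voters), borda_scores(voters))
```

In a system like the one described, the rankings would come from learned preference models rather than from ballots collected at decision time.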


Deep Semi-Random Features for Nonlinear Function Approximation

AAAI Conferences

We propose semi-random features for nonlinear function approximation. The flexibility of semi-random features lies between that of the fully adjustable units in deep learning and that of the random features used in kernel methods. For one-hidden-layer models with semi-random features, we prove with no unrealistic assumptions that the model classes contain an arbitrarily good function as the width increases (universality), and despite non-convexity, we can find such a good function (optimization theory) that generalizes to unseen new data (generalization bound). For deep models, with no unrealistic assumptions, we prove universal approximation ability, a lower bound on approximation error, a partial optimization guarantee, and a generalization bound. Depending on the problem, the generalization bound of deep semi-random features can be exponentially better than the known bounds of deep ReLU nets; our generalization error bound can be independent of the depth, the number of trainable weights, and the input dimensionality. In experiments, we show that semi-random features can match the performance of neural networks by using slightly more units, and that they outperform random features by using significantly fewer units. Moreover, we introduce a new implicit ensemble method by using semi-random features.
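One way to picture a unit that sits between random features and fully trainable units is sketched below. It assumes, as an illustration rather than a statement of the paper's exact definition, that each unit multiplies an on/off gate driven by fixed random weights `r` with a trainable linear term using weights `w`. All names are hypothetical.

```python
import random


def dot(a, b):
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def make_random_directions(n_units, n_features, seed=0):
    # Random gate weights: sampled once and never trained.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_units)]


def semi_random_layer(x, random_dirs, trainable):
    # Assumed unit form: step(x . r) * (x . w).
    # The gate depends only on the random weights r; the output is linear in
    # the trainable weights w whenever the gate is open.
    outputs = []
    for r, w in zip(random_dirs, trainable):
        gate = 1.0 if dot(x, r) > 0 else 0.0
        outputs.append(gate * dot(x, w))
    return outputs


random_dirs = make_random_directions(n_units=4, n_features=2)
trainable = [[0.5, -0.5] for _ in range(4)]  # placeholder trainable weights
print(semi_random_layer([1.0, 2.0], random_dirs, trainable))
```

Because the gates are fixed, the layer output is linear in the trainable weights for a given input, which hints at why optimization is easier to analyze here than in a fully trainable network.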


Guiding Search in Continuous State-Action Spaces by Learning an Action Sampler From Off-Target Search Experience

AAAI Conferences

In robotics, it is essential to be able to plan efficiently in high-dimensional continuous state-action spaces for long horizons. For such complex planning problems, unguided uniform sampling of actions until a path to a goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth. In this paper, we present an approach that guides search in continuous spaces for generic planners by learning an action sampler from past search experience. We use a Generative Adversarial Network (GAN) to represent an action sampler, and address an important issue: search experience consists of a relatively large number of actions that are not on a solution path and a relatively small number of actions that actually are on a solution path. We introduce a new technique, based on an importance-ratio estimation method, for using samples from a non-target distribution to make GAN learning more data-efficient. We provide theoretical guarantees and empirical evaluation in three challenging continuous robot planning problems to illustrate the effectiveness of our algorithm.