RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism

Neural Information Processing Systems

Accuracy and interpretability are two dominant features of successful predictive models. Typically, a choice must be made in favor of complex black box models such as recurrent neural networks (RNN) for accuracy versus less accurate but more interpretable traditional models such as logistic regression. This tradeoff poses challenges in medicine where both accuracy and interpretability are important. We addressed this challenge by developing the REverse Time AttentIoN model (RETAIN) for application to Electronic Health Records (EHR) data. RETAIN achieves high accuracy while remaining clinically interpretable and is based on a two-level neural attention model that detects influential past visits and significant clinical variables within those visits (e.g.
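The two-level attention described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the reverse-time RNNs that RETAIN uses to produce the attention generators are replaced here by single linear maps (`Wa`, `Wb`, `w_alpha` are invented stand-ins), but the structure — a softmax over visits (visit-level attention) and a per-variable gate (variable-level attention) combined into one context vector — follows the description.

```python
import numpy as np

rng = np.random.default_rng(0)

def retain_context(visits, Wa, Wb, w_alpha):
    """Two-level attention over a patient's visit sequence (t visits x d variables).

    Stand-in: one linear map per level replaces the paper's reverse-time RNNs.
    """
    rev = visits[::-1]                                   # process visits in reverse time order
    e = rev @ Wa @ w_alpha                               # scalar relevance score per visit
    alpha = np.exp(e - e.max()); alpha /= alpha.sum()    # visit-level attention (softmax)
    beta = np.tanh(rev @ Wb)                             # variable-level attention per visit
    c = (alpha[:, None] * beta * rev).sum(axis=0)        # attention-weighted context vector
    return alpha[::-1], c                                # report alpha in forward time order

t, d = 5, 4
visits = rng.normal(size=(t, d))
Wa = rng.normal(size=(d, d)); Wb = rng.normal(size=(d, d))
w_alpha = rng.normal(size=d)
alpha, c = retain_context(visits, Wa, Wb, w_alpha)
print(alpha)   # one weight per visit; large values flag influential visits
```

Interpretability comes from reading `alpha` (which past visits mattered) and `beta` (which variables within each visit mattered) directly off the model.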


Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

Neural Information Processing Systems

Many practical perception systems exist within larger processes which often include interactions with users or additional components that are capable of evaluating the quality of predicted solutions. In these contexts, it is beneficial to provide these oracle mechanisms with multiple highly likely hypotheses rather than a single prediction. In this work, we pose the task of producing multiple outputs as a learning problem over an ensemble of deep networks -- introducing a novel stochastic gradient descent based approach to minimize the loss with respect to an oracle. Our method is simple to implement, agnostic to both architecture and loss function, and parameter-free. Our approach achieves lower oracle error compared to existing methods on a wide range of tasks and deep architectures. We also show qualitatively that solutions produced from our approach often provide interpretable representations of task ambiguity.
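The oracle-loss idea above can be illustrated with a toy ensemble. This is a hedged sketch, not the paper's method: the deep networks are replaced by linear regressors, and the targets are deliberately ambiguous (two plausible answers per input) so that the winner-take-all gradient rule — only the currently best member is updated on each example — lets members specialize to different modes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy ensemble of M linear regressors trained with the oracle ("min") loss:
# each example updates only the member whose prediction is currently best.
M, d, n = 3, 2, 200
W = rng.normal(size=(M, d))
X = rng.normal(size=(n, d))
# Ambiguous targets: each input has one of two equally plausible answers.
y = (X @ np.array([1.0, -1.0])) * rng.choice([1.0, -1.0], size=n)

lr = 0.05
for epoch in range(100):
    for x, t in zip(X, y):
        preds = W @ x
        m = int(np.argmin((preds - t) ** 2))      # oracle picks the best member
        W[m] -= lr * 2 * (preds[m] - t) * x       # gradient step on the winner only

# Oracle error: loss of the best hypothesis among the M outputs, per example.
oracle_mse = float(np.mean([np.min((W @ x - t) ** 2) for x, t in zip(X, y)]))
print(oracle_mse)
```

A single regressor averaging the two modes would incur high error on every example; the ensemble covers both, which is exactly what the oracle error measures.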


A Probabilistic Model of Social Decision Making based on Reward Maximization

Neural Information Processing Systems

A fundamental problem in cognitive neuroscience is how humans make decisions, act, and behave in relation to other humans. Here we adopt the hypothesis that when we are in an interactive social setting, our brains perform Bayesian inference of the intentions and cooperativeness of others using probabilistic representations. We employ the framework of partially observable Markov decision processes (POMDPs) to model human decision making in a social context, focusing specifically on the volunteer's dilemma in a version of the classic Public Goods Game. We show that the POMDP model explains both the behavior of subjects and the neural activity recorded using fMRI during the game. The decisions of subjects can be modeled across all trials using two interpretable parameters. Furthermore, the expected reward predicted by the model for each subject was correlated with the activation of brain areas related to reward expectation in social interactions. Our results suggest a probabilistic basis for human social decision making within the framework of expected reward maximization.
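The core inference step such a POMDP agent performs — updating a belief about the hidden cooperativeness of others from their observed actions — can be sketched directly. The two partner types and the contribution likelihoods below are illustrative values, not the paper's fitted parameters.

```python
# Hedged sketch: Bayesian belief update over a partner's hidden cooperativeness,
# the inference a POMDP agent would run each round of a public goods game.
def update_belief(belief_coop, observed_contribution, p_contrib=(0.2, 0.8)):
    """belief_coop: P(partner is the cooperative type).
    p_contrib: P(contribute | type) for (uncooperative, cooperative) types."""
    p_low, p_high = p_contrib
    like_high = p_high if observed_contribution else 1 - p_high
    like_low = p_low if observed_contribution else 1 - p_low
    num = belief_coop * like_high                       # Bayes rule numerator
    return num / (num + (1 - belief_coop) * like_low)   # posterior P(cooperative)

b = 0.5                                  # uninformative prior over partner type
for obs in [True, True, False, True]:    # partner contributes on most rounds
    b = update_belief(b, obs)
print(round(b, 3))
```

The agent's action choice would then maximize expected reward under this posterior, e.g. free-riding when it believes someone else will volunteer.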


Dialog-based Language Learning

Neural Information Processing Systems

A long-term goal of machine learning research is to build an intelligent dialog agent. Most research in natural language understanding has focused on learning from fixed training sets of labeled data, with supervision either at the word level (tagging, parsing tasks) or sentence level (question answering, machine translation). This kind of supervision is not representative of how humans learn, where language is both learned by, and used for, communication. In this work, we study dialog-based language learning, where supervision is given naturally and implicitly in the response of the dialog partner during the conversation. We study this setup in two domains: the bAbI dataset of (Weston et al., 2015) and large-scale question answering from (Dodge et al., 2015). We evaluate a set of baseline learning strategies on these tasks, and show that a novel model incorporating predictive lookahead is a promising approach for learning from a teacher's response. In particular, a surprising result is that it can learn to answer questions correctly without any reward-based supervision at all.
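One simplified form of the reward-free setup above can be sketched as follows. This toy is an illustration of the general idea — learning from the wording of a teacher's textual reply rather than from a scalar reward — and is not the paper's predictive-lookahead model; the vocabulary and dialogs are invented.

```python
# Hedged sketch: the learner never receives a reward signal, only the teacher's
# reply text, and extracts an answer-quality signal from that wording alone.
from collections import defaultdict

POSITIVE = {"yes", "correct", "right"}   # illustrative positive-feedback words

# (question, candidate answers, teacher's reply to each attempted answer)
dialogs = [
    ("color of sky", ["blue", "green"], {"blue": "yes that is right",
                                         "green": "no that is wrong"}),
    ("color of grass", ["blue", "green"], {"green": "correct",
                                           "blue": "no"}),
]

score = defaultdict(float)
for question, candidates, replies in dialogs:
    for answer in candidates:                    # agent tries each answer in turn
        reply_words = set(replies[answer].split())
        if reply_words & POSITIVE:               # learn from reply wording, no reward
            score[(question, answer)] += 1.0

best = max(["blue", "green"], key=lambda a: score[("color of sky", a)])
print(best)
```

The paper's lookahead model goes further — it learns to *predict* the teacher's response for each candidate answer — but the supervision source is the same: the conversational reply itself.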


A Non-parametric Learning Method for Confidently Estimating Patient's Clinical State and Dynamics

Neural Information Processing Systems

Estimating a patient's clinical state from multiple concurrent physiological streams plays an important role in determining whether a therapeutic intervention is necessary and in triaging patients in the hospital. In this paper we construct a non-parametric learning algorithm to estimate the clinical state of a patient. The algorithm addresses several known challenges with clinical state estimation, such as eliminating bias introduced by therapeutic intervention censoring, increasing the timeliness of state estimation while ensuring sufficient accuracy, and detecting anomalous clinical states. These benefits are obtained by combining the tools of non-parametric Bayesian inference, permutation testing, and generalizations of the empirical Bernstein inequality. The algorithm is validated using real-world data from a cancer ward in a large academic hospital.
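Of the tools listed above, the empirical Bernstein inequality is the most self-contained to illustrate. The sketch below computes the standard empirical Bernstein confidence radius (in the Maurer–Pontil form) for the mean of bounded samples — it uses the *sample* variance, so the bound tightens automatically for low-variance streams, which is what makes it useful for timely-but-confident state estimates. How the paper generalizes this bound is not reproduced here.

```python
import math

def empirical_bernstein_radius(xs, delta, value_range):
    """Empirical Bernstein confidence radius for the mean of n i.i.d. samples
    bounded in an interval of width `value_range` (Maurer & Pontil form):
        sqrt(2 * Vn * ln(2/delta) / n) + 7 * R * ln(2/delta) / (3 * (n - 1))
    where Vn is the sample variance. Holds with probability at least 1 - delta.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)   # sample variance
    log_term = math.log(2.0 / delta)
    return (math.sqrt(2 * var * log_term / n)
            + 7 * value_range * log_term / (3 * (n - 1)))

r_small = empirical_bernstein_radius([0.4, 0.5, 0.6] * 4, 0.05, 1.0)
r_large = empirical_bernstein_radius([0.4, 0.5, 0.6] * 400, 0.05, 1.0)
print(r_small, r_large)   # the radius shrinks as more samples arrive
```

In a monitoring setting, one would declare a state estimate "confident" once this radius falls below a clinically chosen tolerance.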


Fast Distributed Submodular Cover: Public-Private Data Summarization

Neural Information Processing Systems

In this paper, we introduce the public-private framework of data summarization, motivated by privacy concerns in personalized recommender systems and online social services. Such systems usually have access to massive data generated by a large pool of users. A major fraction of the data is public and is visible to (and can be used for) all users. However, each user can also contribute some private data that should not be shared with other users, to ensure her privacy. The goal is to provide a succinct summary of the massive dataset, ideally as small as possible, from which customized summaries can be built for each user, i.e., one that can contain elements from the public data (for diversity) and the user's private data (for personalization). To formalize the above challenge, we assume that the scoring function according to which a user evaluates the utility of her summary satisfies submodularity, a widely used notion in data summarization applications.
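The submodular cover objective underlying this framework can be illustrated with the classic greedy rule on a coverage function (a standard example of a monotone submodular function). This sketch is a plain sequential greedy for intuition only — it is not the paper's fast distributed algorithm, and the sets are invented toy data.

```python
def greedy_submodular_cover(universe, candidate_sets):
    """Greedy cover for the (monotone, submodular) coverage function:
    repeatedly add the candidate set with the largest marginal coverage gain
    until every coverable element of `universe` is covered."""
    coverable = universe & set().union(*candidate_sets.values())
    covered, chosen = set(), []
    while not coverable <= covered:
        # Marginal gain of a set = number of new elements it would cover.
        name, best = max(candidate_sets.items(),
                         key=lambda kv: len(kv[1] - covered))
        if not (best - covered):
            break                      # no set adds anything new
        chosen.append(name)
        covered |= best
    return chosen

sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1}}
print(greedy_submodular_cover({1, 2, 3, 4, 5, 6}, sets))  # ['a', 'c']
```

In the public-private setting, the interesting part — solved by the paper, not shown here — is building one shared summary of the public data so that each user's private elements can be merged in cheaply afterwards.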


What was Doge? How Elon Musk tried to gamify government

The Guardian

In 2025, when Elon Musk joined the government as the de facto head of something called the "department of government efficiency", he declared that governments were poorly configured "big dumb machines". To the senator Ted Cruz, he explained that "the only way to reconcile the databases and get rid of waste and fraud is to actually look at the computers". Muskism came to Washington soaked in memes, adolescent boasts and sadistic victory dances over mass firings. Leading a team of teenage coders and mid-level managers drawn from his suite of companies, Musk aimed to enter the codebase and rewrite regulations and budget lines from within. He would drag the paper-pushing bureaucracy kicking and screaming into the digital 21st century, scanning the contents of cavernous rooms of filing cabinets and feeding the data into a single interoperable system. The undertaking combined features of private equity-led restructuring with startup management, shot through with the sensibility of gaming and rightwing culture war. To succeed, he would need "God mode", an overview of the whole. If the mandate of Doge was to "[modernise] federal technology and software to maximise governmental efficiency and productivity", in the words of the executive order that launched the initiative on 20 January 2025, the reality was a strengthening of the state's surveillance capacities. Over time, Musk had become convinced that the real bugs in the code were people, especially the non-white illegal immigrants whom he saw as pawns in a liberal scheme to corrupt democracy and beneficiaries of what he called "suicidal empathy". He understood empathy itself in coding terms.


Tennessee minors sue Musk's xAI, alleging Grok generated sexual images of them

The Japan Times

Three Tennessee plaintiffs, including two minors, sued Elon Musk's xAI on Monday, alleging that it knowingly designed its Grok image generator to let people create sexually explicit content by using real photos of others. The lawsuit, filed in federal court in San Jose, California, is seeking class-action status for people in the United States who were reasonably identifiable in sexualized images or videos generated by Grok based on real images of themselves. The artificial intelligence company did not immediately respond to a request for comment. After an outcry over sexually explicit content generated by the chatbot, xAI said in January that it had blocked all users from editing images of real people in revealing clothing and from generating images of people in revealing clothing in jurisdictions where it's illegal. Governments and regulators around the world have since launched probes, imposed bans and demanded safeguards in a growing push to curb illegal and offensive material.


Equality of Opportunity in Classification: A Causal Approach

Neural Information Processing Systems

The Equalized Odds (for short, EO) is one of the most popular measures of discrimination used in the supervised learning setting. It ascertains fairness through the balance of the misclassification rates (false positive and false negative) across the protected groups -- e.g., in the context of law enforcement, an African-American defendant who would not commit a future crime will have an equal opportunity of being released, compared to a non-recidivating Caucasian defendant. Despite this noble goal, it has been acknowledged in the literature that statistical tests based on the EO are oblivious to the underlying causal mechanisms that generated the disparity in the first place (Hardt et al. 2016). This leads to a critical disconnect between statistical measures readable from the data and the meaning of discrimination in the legal system, where compelling evidence that the observed disparity is tied to a specific causal process deemed unfair by society is required to characterize discrimination. The goal of this paper is to develop a principled approach to connect the statistical disparities characterized by the EO and the underlying, elusive, and frequently unobserved causal mechanisms that generated such inequality. We start by introducing a new family of counterfactual measures that allows one to explain the misclassification disparities in terms of the underlying mechanisms in an arbitrary, non-parametric structural causal model. This will, in turn, allow legal and data analysts to interpret currently deployed classifiers through a causal lens, linking the statistical disparities found in the data to the corresponding causal processes. Leveraging the new family of counterfactual measures, we develop a learning procedure to construct a classifier that is statistically efficient, interpretable, and compatible with the basic human intuition of fairness. We demonstrate our results through experiments on both real (COMPAS) and synthetic datasets.
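The statistical quantity that EO balances — false positive and false negative rates computed per protected group — is easy to make concrete. The sketch below measures the EO gap on invented toy labels; it illustrates only the observational measure the abstract starts from, not the paper's counterfactual analysis.

```python
# Hedged sketch: per-group error rates and the Equalized Odds gap on toy data.
def group_rates(y_true, y_pred, group):
    """Return {group: (false_positive_rate, false_negative_rate)}."""
    rates = {}
    for g in set(group):
        idx = [i for i, gi in enumerate(group) if gi == g]
        fp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 0)
        fn = sum(1 for i in idx if y_pred[i] == 0 and y_true[i] == 1)
        neg = sum(1 for i in idx if y_true[i] == 0)
        pos = sum(1 for i in idx if y_true[i] == 1)
        rates[g] = (fp / neg, fn / pos)
    return rates

y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 1, 0, 0, 0, 1, 1]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]
r = group_rates(y_true, y_pred, group)
eo_gap = max(abs(r["A"][0] - r["B"][0]),   # FPR disparity
             abs(r["A"][1] - r["B"][1]))   # FNR disparity
print(r, eo_gap)
```

The paper's point is precisely that a nonzero (or zero) `eo_gap` says nothing by itself about *which* causal mechanism produced the disparity — that requires the counterfactual measures it introduces.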


Fast deep reinforcement learning using online adjustments from the past

Neural Information Processing Systems

We propose Ephemeral Value Adjustments (EVA): a means of allowing deep reinforcement learning agents to rapidly adapt to experience in their replay buffer. EVA shifts the value predicted by a neural network with an estimate of the value function found by prioritised sweeping over experience tuples from the replay buffer near the current state. EVA brings together a number of recent ideas for incorporating episodic memory-like structures into reinforcement learning agents: slot-based storage, content-based retrieval, and memory-based planning. We show that EVA is performant on a demonstration task and Atari games.
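The "value shifting" idea above can be sketched as a blend of a parametric estimate with an episodic one. This is a simplified stand-in: the paper derives its non-parametric estimate via prioritised sweeping over nearby experience tuples, whereas the sketch just averages the returns of the k nearest stored states; the buffer contents and the mixing weight `lam` are invented.

```python
import numpy as np

rng = np.random.default_rng(2)

def eva_value(q_param, state, buffer_states, buffer_returns, k=3, lam=0.5):
    """Blend a parametric value Q(s) with an episodic estimate from the
    k nearest states in the replay buffer (content-based retrieval)."""
    dists = np.linalg.norm(buffer_states - state, axis=1)
    nearest = np.argsort(dists)[:k]            # slot-based storage, k-NN lookup
    q_np = buffer_returns[nearest].mean()      # non-parametric value estimate
    return lam * q_param + (1 - lam) * q_np    # ephemeral adjustment of Q

buffer_states = rng.normal(size=(100, 4))      # stored states (illustrative)
buffer_returns = rng.normal(size=100)          # their observed returns
state = buffer_states[0]                       # query: a previously seen state
v = eva_value(q_param=1.0, state=state,
              buffer_states=buffer_states, buffer_returns=buffer_returns)
print(v)
```

The adjustment is "ephemeral" because it is recomputed from the current buffer contents at decision time rather than baked into the network's weights.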