Collaborating Authors

Learning Cost-Effective and Interpretable Regimes for Treatment Recommendation Machine Learning

Decision makers, such as doctors and judges, make crucial decisions such as recommending treatments to patients, and granting bails to defendants on a daily basis. Such decisions typically involve weighting the potential benefits of taking an action against the costs involved. In this work, we aim to automate this task of learning {cost-effective, interpretable and actionable treatment regimes. We formulate this as a problem of learning a decision list -- a sequence of if-then-else rules -- which maps characteristics of subjects (eg., diagnostic test results of patients) to treatments. We propose a novel objective to construct a decision list which maximizes outcomes for the population, and minimizes overall costs. We model the problem of learning such a list as a Markov Decision Process (MDP) and employ a variant of the Upper Confidence Bound for Trees (UCT) strategy which leverages customized checks for pruning the search space effectively. Experimental results on real world observational data capturing treatment recommendations for asthma patients demonstrate the effectiveness of our approach.

Targeted Optimal Treatment Regime Learning Using Summary Statistics Machine Learning

Personalized decision-making, aiming to derive optimal individualized treatment rules (ITRs) based on individual characteristics, has recently attracted increasing attention in many fields, such as medicine, social services, and economics. Current literature mainly focuses on estimating ITRs from a single source population. In real-world applications, the distribution of a target population can be different from that of the source population. Therefore, ITRs learned by existing methods may not generalize well to the target population. Due to privacy concerns and other practical issues, individual-level data from the target population is often not available, which makes ITR learning more challenging. We consider an ITR estimation problem where the source and target populations may be heterogeneous, individual data is available from the source population, and only the summary information of covariates, such as moments, is accessible from the target population. We develop a weighting framework that tailors an ITR for a given target population by leveraging the available summary statistics. Specifically, we propose a calibrated augmented inverse probability weighted estimator of the value function for the target population and estimate an optimal ITR by maximizing this estimator within a class of pre-specified ITRs. We show that the proposed calibrated estimator is consistent and asymptotically normal even with flexible semi/nonparametric models for nuisance function approximation, and the variance of the value estimator can be consistently estimated. We demonstrate the empirical performance of the proposed method using simulation studies and a real application to an eICU dataset as the source sample and a MIMIC-III dataset as the target sample.

$Q$- and $A$-Learning Methods for Estimating Optimal Dynamic Treatment Regimes Artificial Intelligence

In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on his/her baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is estimating the optimal regime, that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.

A semiparametric instrumental variable approach to optimal treatment regimes under endogeneity Machine Learning

There is a fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a general instrumental variable approach to learning optimal treatment regimes under endogeneity. Specifically, we provide sufficient conditions for the identification of both value function $E[Y_{\cD(L)}]$ for a given regime $\cD$ and optimal regime $\arg \max_{\cD} E[Y_{\cD(L)}]$ with the aid of a binary instrumental variable, when no unmeasured confounding fails to hold. We establish consistency of the proposed weighted estimators. We also extend the proposed method to identify and estimate the optimal treatment regime among those who would comply to the assigned treatment under monotonicity. In this latter case, we establish the somewhat surprising result that the complier optimal regime can be consistently estimated without directly collecting compliance information and therefore without the complier average treatment effect itself being identified. Furthermore, we propose novel semiparametric locally efficient and multiply robust estimators. Our approach is illustrated via extensive simulation studies and a data application on the effect of child rearing on labor participation.

Model-Based Reinforcement Learning for Sepsis Treatment Artificial Intelligence

Sepsis is a dangerous condition that is a leading cause of patient mortality. Treating sepsis is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. In this work, we explore the use of continuous state-space model-based reinforcement learning (RL) to discover high-quality treatment policies for sepsis patients. Our quantitative evaluation reveals that by blending the treatment strategy discovered with RL with what clinicians follow, we can obtain improved policies, potentially allowing for better medical treatment for sepsis.