Learning Cost-Effective and Interpretable Regimes for Treatment Recommendation Machine Learning

Decision makers, such as doctors and judges, make crucial decisions on a daily basis, such as recommending treatments to patients and granting bail to defendants. Such decisions typically involve weighing the potential benefits of taking an action against the costs involved. In this work, we aim to automate this task of learning cost-effective, interpretable and actionable treatment regimes. We formulate this as a problem of learning a decision list -- a sequence of if-then-else rules -- which maps characteristics of subjects (e.g., diagnostic test results of patients) to treatments. We propose a novel objective to construct a decision list which maximizes outcomes for the population and minimizes overall costs. We model the problem of learning such a list as a Markov Decision Process (MDP) and employ a variant of the Upper Confidence Bound for Trees (UCT) strategy which leverages customized checks for pruning the search space effectively. Experimental results on real-world observational data capturing treatment recommendations for asthma patients demonstrate the effectiveness of our approach.
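To make the notion of a decision list concrete, the following is a minimal Python sketch of how an ordered sequence of if-then-else rules maps subject characteristics to a treatment. The feature names (`test_a`, `test_b`), thresholds, and treatment labels (`T0`, `T1`, `T2`) are hypothetical placeholders, not taken from the paper, and the sketch covers only applying a list, not learning one.

```python
from typing import Callable, Dict, List, Tuple

Subject = Dict[str, float]
# A rule pairs a condition on the subject with the treatment to assign.
Rule = Tuple[Callable[[Subject], bool], str]


def apply_decision_list(rules: List[Rule], default: str, subject: Subject) -> str:
    """Return the treatment of the first rule whose condition holds.

    Rules are checked in order (if / else if / ...); the default treatment
    plays the role of the final "else" branch.
    """
    for condition, treatment in rules:
        if condition(subject):
            return treatment
    return default


# Hypothetical rules, for illustration only.
rules: List[Rule] = [
    (lambda s: s["test_a"] < 0.6, "T1"),
    (lambda s: s["test_b"] < 0.8, "T2"),
]

subject = {"test_a": 0.9, "test_b": 0.5}
print(apply_decision_list(rules, "T0", subject))
```

Because the rules are evaluated in order, a subject who satisfies several conditions receives the treatment of the earliest matching rule, which is part of what keeps such a list easy to read.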

$Q$- and $A$-Learning Methods for Estimating Optimal Dynamic Treatment Regimes Artificial Intelligence

In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on the patient's baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. Using existing data, a key goal is estimating the optimal regime that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.
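As an illustration of the Q-learning idea, the sketch below runs backward induction over two decision points on simulated data. It assumes binary treatments coded 0/1, a single scalar covariate per stage, and linear working models with a treatment-by-covariate interaction; these choices, the function names, and the simulated data are assumptions made for the example, not the specification used in the paper.

```python
import numpy as np


def design(state: np.ndarray, action: np.ndarray) -> np.ndarray:
    """Design matrix with columns: intercept, state, action, action * state."""
    return np.column_stack([np.ones_like(state), state, action, action * state])


def fit_linear(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def q_learning_two_stage(s1, a1, s2, a2, y):
    """Backward induction: fit the stage-2 Q-function, then the stage-1 one."""
    # Stage 2: regress the final outcome on stage-2 state and action.
    beta2 = fit_linear(design(s2, a2), y)
    # Pseudo-outcome: predicted outcome under the better stage-2 action.
    q2_if_0 = design(s2, np.zeros_like(s2)) @ beta2
    q2_if_1 = design(s2, np.ones_like(s2)) @ beta2
    v2 = np.maximum(q2_if_0, q2_if_1)
    # Stage 1: regress the pseudo-outcome on stage-1 state and action.
    beta1 = fit_linear(design(s1, a1), v2)
    return beta1, beta2


def optimal_action(beta: np.ndarray, state: np.ndarray) -> np.ndarray:
    """Treat (action 1) when the estimated contrast beta[2] + beta[3] * state is positive."""
    return (beta[2] + beta[3] * state > 0).astype(int)


# Simulated data (hypothetical): stage-2 treatment helps only when s2 > 0.
rng = np.random.default_rng(0)
n = 2000
s1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
s2 = s1 + 0.5 * rng.normal(size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y = a2 * s2 + 0.1 * rng.normal(size=n)

beta1, beta2 = q_learning_two_stage(s1, a1, s2, a2, y)
print(optimal_action(beta2, np.array([-1.0, 1.0])))
```

Each estimated rule depends only on the treatment contrast (the terms involving the action). A-learning models that contrast directly rather than the full Q-function.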

A semiparametric instrumental variable approach to optimal treatment regimes under endogeneity Machine Learning

There is a fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a general instrumental variable approach to learning optimal treatment regimes under endogeneity. Specifically, we provide sufficient conditions for the identification of both the value function $E[Y_{\mathcal{D}(L)}]$ for a given regime $\mathcal{D}$ and the optimal regime $\arg \max_{\mathcal{D}} E[Y_{\mathcal{D}(L)}]$ with the aid of a binary instrumental variable, when the no-unmeasured-confounding condition fails to hold. We establish consistency of the proposed weighted estimators. We also extend the proposed method to identify and estimate the optimal treatment regime among those who would comply with the assigned treatment under monotonicity. In this latter case, we establish the somewhat surprising result that the complier optimal regime can be consistently estimated without directly collecting compliance information and therefore without the complier average treatment effect itself being identified. Furthermore, we propose novel semiparametric locally efficient and multiply robust estimators. Our approach is illustrated via extensive simulation studies and a data application on the effect of child rearing on labor participation.

Model-Based Reinforcement Learning for Sepsis Treatment Artificial Intelligence

Sepsis is a dangerous condition that is a leading cause of patient mortality. Treating sepsis is highly challenging, because individual patients respond very differently to medical interventions and there is no universally agreed-upon treatment for sepsis. In this work, we explore the use of continuous state-space model-based reinforcement learning (RL) to discover high-quality treatment policies for sepsis patients. Our quantitative evaluation reveals that by blending the treatment strategy discovered with RL with what clinicians follow, we can obtain improved policies, potentially allowing for better medical treatment for sepsis.
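The abstract does not spell out how the two strategies are blended. One simple possibility, shown here purely as an assumed illustration, is a convex combination of the action probabilities of the RL policy and the clinician policy in a given state. The function name `blend_policies` and the `weight` parameter are hypothetical.

```python
import numpy as np


def blend_policies(pi_rl, pi_clinician, weight: float) -> np.ndarray:
    """Mix two action distributions over the same action set.

    weight = 0 reproduces the clinician policy; weight = 1 reproduces the
    RL policy. This mixing rule is an assumption made for illustration.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    pi_rl = np.asarray(pi_rl, dtype=float)
    pi_clinician = np.asarray(pi_clinician, dtype=float)
    if pi_rl.shape != pi_clinician.shape:
        raise ValueError("policies must be defined over the same actions")
    blended = weight * pi_rl + (1.0 - weight) * pi_clinician
    # Renormalize to guard against rounding error in the inputs.
    return blended / blended.sum(axis=-1, keepdims=True)


# Hypothetical action probabilities for one state with three actions.
print(blend_policies([0.7, 0.2, 0.1], [0.1, 0.3, 0.6], weight=0.5))
```

A small `weight` keeps the blended policy close to observed clinical practice, which matters when the policy can only be evaluated on retrospective data.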

Resampling-based Confidence Intervals for Model-free Robust Inference on Optimal Treatment Regimes Machine Learning

Recently, there has been growing interest in estimating optimal treatment regimes, which are individualized decision rules that can achieve maximal average outcomes. This paper considers the problem of inference for optimal treatment regimes in the model-free setting, where the specification of an outcome regression model is not needed. Existing model-free estimators are usually not suitable for the purpose of inference because they either have nonstandard asymptotic distributions or are designed to achieve Fisher-consistent classification performance. This paper first studies a smoothed robust estimator that directly targets the parameters corresponding to the Bayes decision rule for estimating the optimal treatment regime. This estimator is shown to have an asymptotic normal distribution. Furthermore, it is proved that a resampling procedure provides asymptotically accurate inference for both the parameters indexing the optimal treatment regime and the optimal value function. A new algorithm is developed to calculate the proposed estimator with substantially improved speed and stability. Numerical results demonstrate the satisfactory performance of the new methods.