IntelligentPooling: Practical Thompson Sampling for mHealth
Tomkins, Sabina; Liao, Peng; Klasnja, Predrag; Murphy, Susan
In mobile health (mHealth), smart devices deliver behavioral treatments repeatedly over time to a user with the goal of helping the user adopt and maintain healthy behaviors. Reinforcement learning appears ideal for learning how to optimally make these sequential treatment decisions. However, significant challenges must be overcome before reinforcement learning can be effectively deployed in a mobile healthcare setting. In this work we are concerned with the following challenges: 1) individuals who are in the same context can exhibit differential response to treatments, 2) only a limited amount of data is available for learning on any one individual, and 3) responses to treatment are non-stationary. To address these challenges we generalize Thompson-Sampling bandit algorithms to develop IntelligentPooling. IntelligentPooling learns personalized treatment policies, thus addressing challenge one. To address the second challenge, IntelligentPooling updates each user's degree of personalization while making use of available data on other users to speed up learning. Lastly, IntelligentPooling allows responsivity to vary as a function of a user's time since beginning treatment, thus addressing challenge three. We show that IntelligentPooling achieves an average of 26% lower regret than the state of the art. We demonstrate the promise of this approach and its ability to learn from even a small group of users in a live clinical trial.
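As a rough illustration of the kind of Thompson-Sampling bandit the abstract builds on, the sketch below keeps a Gaussian posterior over a linear treatment effect for one user and sends a treatment only when a posterior draw predicts a positive effect. It is not the IntelligentPooling algorithm itself: the class `GaussianThompsonSampler`, the helper `pooled_prior`, the linear effect model, the zero-centered no-treatment reward, and the simple averaging of other users' posteriors are all assumptions made for illustration.

```python
import numpy as np


class GaussianThompsonSampler:
    """Per-user Bayesian linear model of the treatment effect (illustrative).

    Assumes reward under treatment is context @ theta + Gaussian noise and
    that the no-treatment reward is centered at zero.
    """

    def __init__(self, prior_mean, prior_cov, noise_var=1.0, seed=0):
        self.mean = np.asarray(prior_mean, dtype=float)
        self.precision = np.linalg.inv(np.asarray(prior_cov, dtype=float))
        self.noise_var = noise_var
        self.rng = np.random.default_rng(seed)

    def choose_action(self, context):
        """Return 1 (treat) if a posterior draw predicts a positive effect."""
        x = np.asarray(context, dtype=float)
        cov = np.linalg.inv(self.precision)
        theta = self.rng.multivariate_normal(self.mean, cov)
        return int(x @ theta > 0.0)

    def update(self, context, action, reward):
        """Conjugate Gaussian update; only treated decision points are used."""
        if action == 0:
            return
        x = np.asarray(context, dtype=float)
        new_precision = self.precision + np.outer(x, x) / self.noise_var
        rhs = self.precision @ self.mean + x * reward / self.noise_var
        self.mean = np.linalg.solve(new_precision, rhs)
        self.precision = new_precision


def pooled_prior(samplers):
    """Crude stand-in for pooling: average other users' posterior means."""
    return np.mean([s.mean for s in samplers], axis=0)


# Usage: one user with some data, then a new user initialised from the pool.
user_a = GaussianThompsonSampler(prior_mean=[0.0, 0.0], prior_cov=np.eye(2))
user_a.update(context=[1.0, 0.0], action=1, reward=2.0)
user_b = GaussianThompsonSampler(prior_mean=[0.0, 0.0], prior_cov=np.eye(2))
new_user = GaussianThompsonSampler(
    prior_mean=pooled_prior([user_a, user_b]), prior_cov=np.eye(2)
)
action = new_user.choose_action([1.0, 0.0])
```

Starting a new user's prior at the pooled mean is only one simple way to borrow strength from other users. The abstract describes something richer, in which each user's degree of personalization is itself updated as data arrive.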
Rapidly Personalizing Mobile Health Treatment Policies with Limited Data
Tomkins, Sabina; Liao, Peng; Klasnja, Predrag; Yeung, Serena; Murphy, Susan
Mobile health (mHealth) interventions deliver treatments to users to support healthy behaviors. These interventions offer an opportunity for social impact in a diverse range of domains, from substance abuse (Rabbi et al., 2017) to disease management (Hamine et al., 2015) to physical inactivity (Consolvo et al., 2008). For example, to help users increase their physical activity, an mHealth application might send walking suggestions at times and in locations where a user is likely to be able to act on them. The promise of mHealth hinges on the ability to provide interventions at times when users need the support and are receptive to it (Nahum-Shani et al., 2017). Consequently, in developing reinforcement learning (RL) algorithms for mHealth, our goal is to learn an optimal policy of when and how to intervene for a given user and context.