
Collaborating Authors: Rucker, Mark


Infinite Action Contextual Bandits with Reusable Data Exhaust

arXiv.org Artificial Intelligence

Those who ignore history are doomed to repeat it. A modern variant of this truth arises in controlled experimentation platforms, where offline procedures are a critical complement to online tests, e.g., supporting counterfactual evaluation strategies (Agarwal et al., 2016), offline model selection (Li et al., 2015), and prioritization of scarce online experimental resources (Gomez-Uribe & Hunt, 2015). Consequently, the utility of a learning algorithm is not solely determined by online performance, but also by the post-hoc utility of the data exhaust. The recent contribution of Zhu & Mineiro (2022) exemplifies this: an online contextual bandit algorithm for infinite action spaces with O(1) space and time complexity with respect to the action set. Unfortunately, this performance is achieved by sampling from a distribution which is not absolutely continuous with respect to the reference measure. Therefore, a variety of post-hoc evaluation procedures that rely on importance weighting cannot be applied, limiting adoption. In this paper, we describe an alternative approach to infinite action spaces which not only enjoys a similar smooth regret guarantee (and empirical performance), but also utilizes sampling distributions with well-defined importance weights. In exchange, we pay an increased computational cost.
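The absolute-continuity requirement the abstract mentions can be made concrete with a small sketch (this is an illustration of the general failure mode, not the paper's algorithm): inverse propensity scoring (IPS) over a finite action set is only well defined when the logging policy puts positive probability on every action the target policy can take.

```python
import numpy as np

# Toy off-policy evaluation over 10 actions. The "logging" policy has
# full support, so IPS weights are well defined; the "degenerate" policy
# puts all mass on one action, so most importance weights divide by zero,
# mirroring the lack of absolute continuity the abstract describes.
rng = np.random.default_rng(0)
n_actions = 10

logging = np.full(n_actions, 1.0 / n_actions)   # full support
degenerate = np.zeros(n_actions)                # support on action 0 only
degenerate[0] = 1.0
target = np.full(n_actions, 1.0 / n_actions)    # policy we want to evaluate

def ips_value(logged_actions, rewards, logging_probs, target_probs):
    """IPS estimate of the target policy's value from logged data."""
    w = target_probs[logged_actions] / logging_probs[logged_actions]
    return float(np.mean(w * rewards))

actions = rng.integers(0, n_actions, size=1000)
rewards = (actions % 2 == 0).astype(float)      # toy reward: even actions pay 1

print("IPS estimate:", ips_value(actions, rewards, logging, target))

# With the degenerate logger, 9 of 10 actions have undefined weights:
with np.errstate(divide="ignore"):
    w_bad = target / degenerate
print("undefined weights:", int(np.isinf(w_bad).sum()))
```

Any estimator in the importance-weighting family (IPS, doubly robust, SNIPS) inherits this requirement, which is why a logging distribution without a density with respect to the reference measure blocks the whole family of post-hoc procedures.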


Personalized State Anxiety Detection: An Empirical Study with Linguistic Biomarkers and A Machine Learning Pipeline

arXiv.org Artificial Intelligence

Individuals high in social anxiety symptoms often exhibit elevated state anxiety in social situations. Research has shown it is possible to detect state anxiety by leveraging digital biomarkers and machine learning techniques. However, most existing work trains models on an entire group of participants, failing to capture individual differences in their psychological and behavioral responses to social contexts. To address this concern, in Study 1, we collected linguistic data from N=35 high socially anxious participants in a variety of social contexts, finding that digital linguistic biomarkers significantly differ between evaluative vs. non-evaluative social contexts and between individuals having different trait psychological symptoms, suggesting the likely importance of personalized approaches to detect state anxiety. In Study 2, we used the same data and results from Study 1 to model a multilayer personalized machine learning pipeline to detect state anxiety that considers contextual and individual differences. This personalized model outperformed the baseline F1-score by 28.0%. Results suggest that state anxiety can be more accurately detected with personalized machine learning approaches, and that linguistic biomarkers hold promise for identifying periods of state anxiety in an unobtrusive way.


Personalized Reward Learning with Interaction-Grounded Learning (IGL)

arXiv.org Artificial Intelligence

In an era of countless content offerings, recommender systems alleviate information overload by providing users with personalized content suggestions. Due to the scarcity of explicit user feedback, modern recommender systems typically optimize for the same fixed combination of implicit feedback signals across all users. However, this approach disregards a growing body of work highlighting that (i) implicit signals can be used by users in diverse ways, signaling anything from satisfaction to active dislike, and (ii) different users communicate preferences in different ways. We propose applying the recent Interaction-Grounded Learning (IGL) paradigm to address the challenge of learning representations of diverse user communication modalities. Rather than requiring a fixed, human-designed reward function, IGL is able to learn personalized reward functions for different users and then optimize directly for the latent user satisfaction. We demonstrate the success of IGL with experiments using simulations as well as with real-world production traces.

From shopping to reading the news, modern Internet users have access to an overwhelming amount of content and choices from online services. Recommender systems offer a way to improve user experience and decrease information overload by providing a customized selection of content. A key challenge for recommender systems is the rarity of explicit user feedback, such as ratings or likes/dislikes (Grčar et al., 2005). Rather than explicit feedback, practitioners typically use more readily available implicit signals, such as clicks (Hu et al., 2008), webpage dwell time (Yi et al., 2014), or inter-arrival times (Wu et al., 2017) as a proxy signal for user satisfaction. These implicit signals are used as the reward objective in recommender systems, with the popular Click-Through Rate (CTR) metric as the gold standard for the field (Silveira et al., 2019).
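The failure mode motivating a learned, per-user reward function can be shown with a toy simulation (not the paper's IGL algorithm): two users have the same true satisfaction rate, but for one of them clicks mostly accompany disliked items, so a fixed CTR objective misreads that user's experience.

```python
import numpy as np

# Two simulated users. For user A, clicks track satisfaction; for user B,
# clicks mostly occur on disliked items (e.g., accidental taps, rage
# clicks). Both see items they like 70% of the time.
rng = np.random.default_rng(1)

def simulate(user, item_quality, n=2000):
    liked = rng.random(n) < item_quality
    if user == "A":                      # click when satisfied
        clicks = liked & (rng.random(n) < 0.9)
    else:                                # user B: click when dissatisfied
        clicks = ~liked & (rng.random(n) < 0.9)
    return float(liked.mean()), float(clicks.mean())

sat_a, ctr_a = simulate("A", 0.7)
sat_b, ctr_b = simulate("B", 0.7)

# A fixed CTR objective scores the identical experiences very differently,
# which is the gap a personalized reward function is meant to close.
print(f"user A: satisfaction={sat_a:.2f} CTR={ctr_a:.2f}")
print(f"user B: satisfaction={sat_b:.2f} CTR={ctr_b:.2f}")
```

IGL's contribution, per the abstract, is to learn the mapping from such ambiguous feedback signals to latent reward per user rather than hard-coding one interpretation for everyone.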


Inverse Reinforcement Learning for Strategy Identification

arXiv.org Artificial Intelligence

In adversarial environments, one side could gain an advantage by identifying the opponent's strategy. For example, in combat games, if an opponent's strategy is identified as overly aggressive, one could lay a trap that exploits the opponent's aggressive nature. However, an opponent's strategy is not always apparent and may need to be estimated from observations of their actions. This paper proposes to use inverse reinforcement learning (IRL) to identify strategies in adversarial environments. Specifically, the contributions of this work are 1) the demonstration of this concept on gaming combat data generated from three pre-defined strategies and 2) the framework for using IRL to achieve strategy identification. The numerical experiments demonstrate that the recovered rewards can be identified using a variety of techniques. In this paper, the recovered rewards are visually displayed, clustered using unsupervised learning, and classified using a supervised learner.
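The downstream identification step described in the abstract can be sketched as follows (this assumes IRL has already produced one recovered reward vector per episode; the prototypes and feature names are hypothetical, and the IRL recovery itself is not shown):

```python
import numpy as np

# Hypothetical prototypes for three strategies in a 4-feature reward
# space, e.g. weights on [damage dealt, damage taken, distance kept,
# survival time]. Recovered rewards are noisy samples around them.
rng = np.random.default_rng(2)
prototypes = np.array([
    [1.0, -0.2, -0.8, 0.1],   # aggressive
    [-0.3, -1.0, 0.9, 0.8],   # defensive
    [0.2, -0.5, 0.0, 1.0],    # survival-focused
])
labels = rng.integers(0, 3, size=300)
rewards = prototypes[labels] + 0.1 * rng.normal(size=(300, 4))

def kmeans(x, k, iters=50):
    """Unsupervised step: plain k-means over recovered reward vectors."""
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((x[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([x[assign == j].mean(0) for j in range(k)])
    return centers, assign

centers, assign = kmeans(rewards, 3)

def classify(v, centers):
    """Supervised stand-in: assign a new reward vector to the nearest
    learned cluster center."""
    return int(np.argmin(((centers - v) ** 2).sum(-1)))

print("cluster of a new aggressive-style reward:",
      classify(prototypes[0] + 0.05, centers))
```

The paper additionally evaluates visualization and a trained supervised classifier; nearest-centroid here is just the simplest stand-in that closes the cluster-then-classify loop.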


A Framework for Addressing the Risks and Opportunities In AI-Supported Virtual Health Coaches

arXiv.org Artificial Intelligence

Virtual coaching has rapidly evolved into a foundational component of modern clinical practice. At a time when healthcare professionals are in short supply and the demand for low-cost treatments is ever-increasing, virtual health coaches (VHCs) offer intervention-on-demand for those limited by finances or geographic access to care. More recently, AI-powered virtual coaches have become a viable complement to human coaches. However, the push for AI-powered coaching systems raises several important issues for researchers, designers, clinicians, and patients. In this paper, we present a novel framework to guide the design and development of virtual coaching. (Figure 1: The figure shows four main domains of a successful virtual health coach throughout a data science pipeline.)