AITopics

2503.22137

Country:

North America > United States > California (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Dec-10-2024

CANDOR: Counterfactual ANnotated DOubly Robust Off-Policy Evaluation

Mandyam, Aishwarya, Tang, Shengpu, Yao, Jiayu, Wiens, Jenna, Engelhardt, Barbara E.

Off-policy evaluation (OPE) provides safety guarantees by estimating the performance of a policy before deployment. Recent work introduced IS+, an importance sampling (IS) estimator that uses expert-annotated counterfactual samples to improve behavior dataset coverage. However, IS estimators are known to have high variance; furthermore, the performance of IS+ deteriorates when annotations are imperfect. In this work, we propose a family of OPE estimators inspired by the doubly robust (DR) principle. A DR estimator combines IS with a reward model estimate, known as the direct method (DM), and offers favorable statistical guarantees. We propose three strategies for incorporating counterfactual annotations into a DR-inspired estimator and analyze their properties under various realistic settings. We prove that using imperfect annotations in the DM part of the estimator best leverages the annotations, as opposed to using them in the IS part. To support our theoretical findings, we evaluate the proposed estimators in three contextual bandit environments. Our empirical results show that when the reward model is misspecified and the annotations are imperfect, it is most beneficial to use the annotations only in the DM portion of a DR estimator. Based on these theoretical and empirical insights, we provide a practical guide for using counterfactual annotations in different realistic settings.
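As a rough illustration of the doubly robust principle the abstract refers to, the sketch below combines a direct-method term with an importance-weighted correction in a contextual bandit. It is a generic DR estimator, not the CANDOR estimators, and it does not use counterfactual annotations. All names (`dr_estimate`, `target_prob`, `behavior_prob`, `reward_model`) are hypothetical and chosen only for this example.

```python
def dr_estimate(data, actions, target_prob, behavior_prob, reward_model):
    """Generic doubly robust value estimate for a contextual bandit.

    data          : list of (context, action, reward) tuples logged by the behavior policy
    actions       : list of all possible actions
    target_prob   : function (context, action) -> probability under the evaluation policy
    behavior_prob : function (context, action) -> probability under the behavior policy
    reward_model  : function (context, action) -> predicted reward (the DM component)
    """
    total = 0.0
    for x, a, r in data:
        # Direct-method term: expected predicted reward under the target policy.
        dm = sum(target_prob(x, b) * reward_model(x, b) for b in actions)
        # Importance-sampling correction applied to the model's residual.
        rho = target_prob(x, a) / behavior_prob(x, a)
        total += dm + rho * (r - reward_model(x, a))
    return total / len(data)


# Toy setup (assumed for illustration): reward is context + action.
ACTIONS = [0, 1]
LOGGED = [(0.0, 0, 0.0), (1.0, 1, 2.0), (2.0, 0, 2.0)]

def uniform_behavior(x, a):
    return 0.5

def target_policy(x, a):
    return 0.8 if a == 1 else 0.2

def perfect_model(x, a):
    return x + a

def zero_model(x, a):
    return 0.0

value_with_perfect_model = dr_estimate(LOGGED, ACTIONS, target_policy, uniform_behavior, perfect_model)
value_with_zero_model = dr_estimate(LOGGED, ACTIONS, target_policy, uniform_behavior, zero_model)
```

With a perfect reward model the correction term vanishes and the estimate equals the direct method. With a reward model that always predicts zero, the estimate reduces to plain importance sampling.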

annotation, data mining, machine learning, (19 more...)

2412.08052

Country: North America > United States (0.73)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.92)

arXiv.org Artificial Intelligence, Nov-15-2023

Adaptive Interventions with User-Defined Goals for Health Behavior Change

Mandyam, Aishwarya, Joerke, Matthew, Engelhardt, Barbara E., Brunskill, Emma

Physical inactivity remains a major public health concern, having associations with adverse health outcomes such as cardiovascular disease and type-2 diabetes. Mobile health applications present a promising avenue for low-cost, scalable physical activity promotion, yet often suffer from small effect sizes and low adherence rates, particularly in comparison to human coaching. Goal-setting is a critical component of health coaching that has been underutilized in adaptive algorithms for mobile health interventions. This paper introduces a modification to the Thompson sampling algorithm that places emphasis on individualized goal-setting by optimizing personalized reward functions. As a step towards supporting goal-setting, this paper offers a balanced approach that can leverage shared structure while optimizing individual preferences and goals. We prove that our modification incurs only a constant penalty on the cumulative regret while preserving the sample complexity benefits of data sharing. In a physical activity simulator, we demonstrate that our algorithm achieves substantial improvements in cumulative regret compared to baselines that do not share data or do not optimize for individualized rewards.
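For readers unfamiliar with the base algorithm, the following is a minimal sketch of standard Beta-Bernoulli Thompson sampling. It is not the goal-aware modification proposed in the paper and includes neither personalized reward functions nor data sharing across users. The function name `thompson_sampling` and the arm means are assumptions made for this example.

```python
import random


def thompson_sampling(true_means, n_rounds, seed=0):
    """Standard Thompson sampling for Bernoulli arms with Beta(1, 1) priors.

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1] * k  # 1 + observed successes per arm
    beta = [1] * k   # 1 + observed failures per arm
    pulls = [0] * k
    for _ in range(n_rounds):
        # Draw one plausible mean per arm from its posterior, then act greedily on the draws.
        draws = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: draws[i])
        # Simulated Bernoulli reward from the (hypothetical) environment.
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pull_counts = thompson_sampling([0.1, 0.9], n_rounds=2000)
```

Over enough rounds the posterior for the better arm concentrates, so most pulls go to it.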

adaptive intervention, artificial intelligence, machine learning, (14 more...)

2311.09483

Country: North America > United States (0.15)

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)

arXiv.org Artificial Intelligence, Oct-12-2023

Kernel Density Bayesian Inverse Reinforcement Learning

Mandyam, Aishwarya, Li, Didong, Cai, Diana, Jones, Andrew, Engelhardt, Barbara E.

Inverse reinforcement learning (IRL) is a powerful framework to infer an agent's reward function by observing its behavior, but IRL algorithms that learn point estimates of the reward function can be misleading because there may be several functions that describe an agent's behavior equally well. A Bayesian approach to IRL models a distribution over candidate reward functions, alleviating the shortcomings of learning a point estimate. However, several Bayesian IRL algorithms use a $Q$-value function in place of the likelihood function. The resulting posterior is computationally intensive to calculate and has few theoretical guarantees, and the $Q$-value function is often a poor approximation for the likelihood. We introduce kernel density Bayesian IRL (KD-BIRL), which uses conditional kernel density estimation to directly approximate the likelihood, providing an efficient framework that, with a modified reward function parameterization, is applicable to environments with complex and infinite state spaces. We demonstrate KD-BIRL's benefits through a series of experiments in Gridworld environments and a simulated sepsis treatment task.
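To make the idea of conditional kernel density estimation concrete, here is a minimal one-dimensional sketch that estimates p(y | x) from paired samples with Gaussian kernels. It illustrates the generic technique only and is not the KD-BIRL likelihood, which conditions on richer quantities. The names `gaussian_kernel` and `conditional_kde` and the bandwidths `hx`, `hy` are assumptions made for this example.

```python
import math


def gaussian_kernel(u):
    """Standard normal density evaluated at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def conditional_kde(y, x, samples, hx=0.5, hy=0.5):
    """Estimate the conditional density p(y | x) from (x_i, y_i) pairs.

    Each sample is weighted by how close x_i is to the query x, and the
    weighted samples form a kernel density estimate over y.
    """
    numerator = 0.0
    denominator = 0.0
    for xi, yi in samples:
        weight = gaussian_kernel((x - xi) / hx)
        numerator += weight * gaussian_kernel((y - yi) / hy) / hy
        denominator += weight
    return numerator / denominator if denominator > 0.0 else 0.0


# Toy data (assumed): y tracks x exactly.
PAIRS = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
density_near = conditional_kde(0.0, 0.0, PAIRS)
density_far = conditional_kde(3.0, 0.0, PAIRS)
```

Because the toy data place y near x, the estimated density at y = 0 given x = 0 is much larger than at y = 3.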

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2303.06827

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial Intelligence, Jun-14-2023

Bayesian Non-linear Latent Variable Modeling via Random Fourier Features

Zhang, Michael Minyi, Gundersen, Gregory W., Engelhardt, Barbara E.

The Gaussian process latent variable model (GPLVM) is a popular probabilistic method used for nonlinear dimension reduction, matrix factorization, and state-space modeling. Inference for GPLVMs is computationally tractable only when the data likelihood is Gaussian. Moreover, inference for GPLVMs has typically been restricted to obtaining maximum a posteriori point estimates, which can lead to overfitting, or variational approximations, which mischaracterize the posterior uncertainty. Here, we present a method to perform Markov chain Monte Carlo (MCMC) inference for generalized Bayesian nonlinear latent variable modeling. The crucial insight necessary to generalize GPLVMs to arbitrary observation models is that we approximate the kernel function in the Gaussian process mappings with random Fourier features; this allows us to compute the gradient of the posterior in closed form with respect to the latent variables. We show that we can generalize GPLVMs to non-Gaussian observations, such as Poisson, negative binomial, and multinomial distributions, using our random feature latent variable model (RFLVM). Our generalized RFLVMs perform on par with state-of-the-art latent variable models on a wide range of applications, including motion capture, images, and text data for the purpose of estimating the latent structure and imputing the missing data of these complex data sets.

Keywords: Latent variable modeling, Gaussian processes, probabilistic modeling.
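The random Fourier feature approximation mentioned in the abstract can be sketched in a few lines. The example below approximates a squared-exponential (RBF) kernel with random cosine features. It shows only the kernel approximation, not the RFLVM model or its MCMC inference, and the names `make_rff` and `rbf_kernel` are hypothetical.

```python
import math
import random


def rbf_kernel(x, y, lengthscale=1.0):
    """Exact squared-exponential kernel exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


def make_rff(dim, n_features, lengthscale=1.0, seed=0):
    """Return a feature map z such that dot(z(x), z(y)) approximates rbf_kernel(x, y)."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density, N(0, I / lengthscale^2).
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
               for _ in range(n_features)]
    # Random phases are uniform on [0, 2*pi].
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, phases)]

    return features


feature_map = make_rff(dim=2, n_features=5000)
point_a = [0.3, -0.2]
point_b = [0.5, 0.4]
approx_kernel = sum(u * v for u, v in zip(feature_map(point_a), feature_map(point_b)))
exact_kernel = rbf_kernel(point_a, point_b)
```

The approximation error shrinks roughly as one over the square root of the number of features. Because the resulting model is linear in the features, gradients with respect to the inputs take a simple form.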

artificial intelligence, latent variable model, machine learning, (12 more...)

2306.08352

Country: North America > United States (0.67)

Genre: Research Report (0.50)

Industry: Banking & Finance (0.46)

arXiv.org Machine Learning, Mar-25-2021

Active multi-fidelity Bayesian online changepoint detection

Gundersen, Gregory W., Cai, Diana, Zhou, Chuteng, Engelhardt, Barbara E., Adams, Ryan P.

Online algorithms for detecting changepoints, or abrupt shifts in the behavior of a time series, are often deployed with limited resources, e.g., to edge computing settings such as mobile phones or industrial sensors. In these scenarios it may be beneficial to trade the cost of collecting an environmental measurement against the quality or "fidelity" of this measurement and how the measurement affects changepoint estimation. For instance, one might decide between inertial measurements or GPS to determine changepoints for motion. A Bayesian approach to changepoint detection is particularly appealing because we can represent our posterior uncertainty about changepoints and make active, cost-sensitive decisions about data fidelity to reduce this posterior uncertainty. Moreover, the total cost could be dramatically lowered through active fidelity switching, while remaining robust to changes in data distribution. We propose a multi-fidelity approach that makes cost-sensitive decisions about which data fidelity to collect based on maximizing information gain with respect to changepoints. We evaluate this framework on synthetic, video, and audio data and show that this information-based approach results in accurate predictions while reducing total cost.

fidelity, health & medicine, neural network, (22 more...)

2103.14224

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning, Jun-19-2020

Latent variable modeling with random features

Gundersen, Gregory W., Zhang, Michael Minyi, Engelhardt, Barbara E.

Gaussian process-based latent variable models are flexible and theoretically grounded tools for nonlinear dimension reduction, but generalizing to non-Gaussian data likelihoods within this nonlinear framework is statistically challenging. Here, we use random features to develop a family of nonlinear dimension reduction models that are easily extensible to non-Gaussian data likelihoods; we call these random feature latent variable models (RFLVMs). By approximating a nonlinear relationship between the latent space and the observations with a function that is linear with respect to random features, we induce closed-form gradients of the posterior distribution with respect to the latent variable. This allows the RFLVM framework to support computationally tractable nonlinear latent variable models for a variety of data likelihoods in the exponential family without specialized derivations. Our generalized RFLVMs produce results comparable with other state-of-the-art dimension reduction methods on diverse types of data, including neural spike train recordings, images, and text data.

artificial intelligence, machine learning, rflvm, (14 more...)

2006.11145

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

arXiv.org Machine Learning, Jun-1-2019

Patient-Specific Effects of Medication Using Latent Force Models with Gaussian Processes

Cheng, Li-Fang, Dumitrascu, Bianca, Zhang, Michael, Chivers, Corey, Draugelis, Michael, Li, Kai, Engelhardt, Barbara E.

Multi-output Gaussian processes (GPs) are a flexible Bayesian nonparametric framework that has proven useful in jointly modeling the physiological states of patients in medical time series data. However, capturing the short-term effects of drugs and therapeutic interventions on patient physiological state remains challenging. We propose a novel approach that models the effect of interventions as a hybrid Gaussian process composed of a GP capturing patient physiology convolved with a latent force model capturing effects of treatments on specific physiological features. This convolution of a multi-output GP with a GP including a causal time-marked kernel leads to a well-characterized model of the patients' physiological state responding to interventions. We show that our model leads to analytically tractable cross-covariance functions, allowing scalable inference. Our hierarchical model includes estimates of patient-specific effects but allows sharing of support across patients. Our approach achieves competitive predictive performance on challenging hospital data, where we recover patient-specific response to the administration of three common drugs: one antihypertensive drug and two anticoagulants.

cardiology, latent force model, vascular disease, (19 more...)

1906.00226

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.94)

Technology:

Information Technology > Modeling & Simulation (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Machine Learning, May-23-2019

Sequential Gaussian Processes for Online Learning of Nonstationary Functions

Zhang, Michael Minyi, Dumitrascu, Bianca, Williamson, Sinead A., Engelhardt, Barbara E.

Many machine learning problems can be framed in the context of estimating functions, and often these are time-dependent functions that are estimated in real time as observations arrive. Gaussian processes (GPs) are an attractive choice for modeling real-valued nonlinear functions due to their flexibility and uncertainty quantification. However, the typical GP regression model suffers from several drawbacks: i) conventional GP inference scales as $O(N^{3})$ with respect to the number of observations; ii) updating a GP model sequentially is not trivial; and iii) covariance kernels often enforce stationarity constraints on the function, while GPs with non-stationary covariance kernels are often intractable to use in practice. To overcome these issues, we propose an online sequential Monte Carlo algorithm to fit mixtures of GPs that capture non-stationary behavior while allowing for fast, distributed inference. By formulating hyperparameter optimization as a multi-armed bandit problem, we accelerate mixing for real-time inference. Our approach empirically improves performance over state-of-the-art methods for online GP estimation in the context of prediction for simulated non-stationary data and hospital time series data.

computer based training, educational technology, hyperparameter, (23 more...)

1905.10003

Country: North America > United States > Texas (0.14)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting > Online (0.40)
Health & Medicine > Therapeutic Area (0.30)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

arXiv.org Machine Learning, Oct-30-2017

How Algorithmic Confounding in Recommendation Systems Increases Homogeneity and Decreases Utility

Chaney, Allison J. B., Stewart, Brandon M., Engelhardt, Barbara E.

Recommendation systems occupy an expanding role in everyday decision making, from choice of movies and household goods to consequential medical and legal decisions. The data used to train and test these systems is algorithmically confounded in that it is the result of a feedback loop between human choices and an existing algorithmic recommendation system. Using simulations, we demonstrate that algorithmic confounding can disadvantage algorithms in training, bias held-out evaluation, and amplify homogenization of user behavior without gains in utility.

algorithm, artificial intelligence, recommendation system, (16 more...)

1710.11214

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Media (0.69)
Law (0.48)
Government (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)