AITopics | confounded observational data

8252831b9fce7a49421e622c14ce0f65-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-12-2026, 18:45:39 GMT

algorithm, dtrs, online rl, (13 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Neural Information Processing Systems | Feb-10-2026, 17:59:49 GMT

Empowered by neural networks, deep reinforcement learning (DRL) achieves tremendous empirical success. However, DRL requires collecting a large dataset by interacting with the environment, which is unrealistic in critical scenarios such as autonomous driving and personalized medicine. In this paper, we study how to incorporate the dataset collected in the offline setting to improve the sample efficiency in the online setting. To incorporate the observational data, we face two challenges.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Neural Information Processing Systems | Dec-24-2025, 18:31:50 GMT

Empowered by neural networks, deep reinforcement learning (DRL) achieves tremendous empirical success. However, DRL requires collecting a large dataset by interacting with the environment, which is unrealistic in critical scenarios such as autonomous driving and personalized medicine. In this paper, we study how to incorporate the dataset collected in the offline setting to improve the sample efficiency in the online setting. To incorporate the observational data, we face two challenges.

confounded observational data, observational data, provably efficient causal reinforcement learning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

evaluations overly harsh and would ask reviewers to reconsider our paper in the light of clarifications provided below.

Neural Information Processing Systems | Oct-3-2025, 02:51:51 GMT

We thank the reviewers for their thoughtful feedback. The applications of online RL in health care are motivated by the increasing "use [...] For experimental studies (e.g., RCTs) in DTRs, issues of sample [...] Our analysis reveals that this is not the case. We really appreciate the reviewers' helpful suggestions and references.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

b0b79da57b95837f14be95aaa4d54cf8-Paper.pdf

Neural Information Processing Systems | Aug-16-2025, 21:40:39 GMT

data mining, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Neural Information Processing Systems | Jan-18-2025, 17:40:32 GMT

Empowered by neural networks, deep reinforcement learning (DRL) achieves tremendous empirical success. However, DRL requires collecting a large dataset by interacting with the environment, which is unrealistic in critical scenarios such as autonomous driving and personalized medicine. In this paper, we study how to incorporate the dataset collected in the offline setting to improve the sample efficiency in the online setting. To incorporate the observational data, we face two challenges. To tackle such challenges, we propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.

confounded observational data, observational data, provably efficient causal reinforcement learning, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Wang, Lingxiao, Yang, Zhuoran, Wang, Zhaoran

arXiv.org Machine Learning | Jun-22-2020

Empowered by expressive function approximators such as neural networks, deep reinforcement learning (DRL) achieves tremendous empirical successes. However, learning expressive function approximators requires collecting a large dataset (interventional data) by interacting with the environment. Such a lack of sample efficiency prohibits the application of DRL to critical scenarios, e.g., autonomous driving and personalized medicine, since trial and error in the online setting is often unsafe and even unethical. In this paper, we study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting. To incorporate the possibly confounded observational data, we propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner. More specifically, DOVI explicitly adjusts for the confounding bias in the observational data, where the confounders are partially observed or unobserved. In both cases, such adjustments allow us to construct the bonus based on a notion of information gain, which takes into account the amount of information acquired from the offline setting. In particular, we prove that the regret of DOVI is smaller than the optimal regret achievable in the pure online setting by a multiplicative factor, which decreases towards zero when the confounded observational data are more informative upon the adjustments. Our algorithm and analysis serve as a step towards causal reinforcement learning.
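The abstract's claim that offline data shrinks the exploration bonus can be illustrated with a small numerical sketch. This is not the DOVI algorithm itself: it assumes a linear feature model, uses the standard elliptical bonus from optimistic value iteration, and takes the offline features as already adjusted for confounding (the adjustment step is omitted entirely). All names (`gram_matrix`, `ucb_bonus`, `offline_phi`) and the synthetic data are hypothetical.

```python
import numpy as np

def gram_matrix(features, reg=1.0):
    """Regularized Gram matrix: reg * I + sum_i phi_i phi_i^T."""
    features = np.asarray(features, dtype=float)
    d = features.shape[1]
    return reg * np.eye(d) + features.T @ features

def ucb_bonus(phi, gram, beta=1.0):
    """Elliptical bonus beta * sqrt(phi^T gram^{-1} phi)."""
    phi = np.asarray(phi, dtype=float)
    return beta * float(np.sqrt(phi @ np.linalg.solve(gram, phi)))

rng = np.random.default_rng(0)
d = 4
online_phi = rng.normal(size=(20, d))    # features seen while interacting online
offline_phi = rng.normal(size=(500, d))  # assumed: offline features after deconfounding

# Gram matrix from online data only, and from online plus offline data.
gram_online = gram_matrix(online_phi)
gram_both = gram_online + offline_phi.T @ offline_phi

# Bonus at a fixed query feature, with and without the offline data.
query = np.ones(d) / np.sqrt(d)
bonus_online = ucb_bonus(query, gram_online)
bonus_both = ucb_bonus(query, gram_both)

# One common notion of information gain: half the log-determinant ratio.
info_gain = 0.5 * (np.linalg.slogdet(gram_both)[1]
                   - np.linalg.slogdet(gram_online)[1])

print(bonus_online, bonus_both, info_gain)
```

Under these assumptions the bonus computed with the offline data is smaller than the online-only bonus, and the log-determinant ratio is positive. That is the mechanism behind a regret bound that improves as the adjusted observational data become more informative.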

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Machine Learning

arXiv:2006.12311

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
