Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
Empowered by neural networks, deep reinforcement learning (DRL) achieves tremendous empirical success. However, DRL requires a large dataset collected by interacting with the environment, which is unrealistic in critical scenarios such as autonomous driving and personalized medicine. In this paper, we study how to incorporate a dataset collected in the offline setting to improve sample efficiency in the online setting. Incorporating such observational data raises two challenges.
Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints
Real-world reinforcement learning (RL) applications often come with possibly infinite state and action spaces, and in such a situation classical RL algorithms developed in the tabular setting are no longer applicable. A popular approach to overcoming this issue is to apply function approximation techniques to the underlying structures of the Markov decision processes (MDPs).
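As a minimal illustrative sketch of the linear function approximation idea (not the paper's specific algorithm): the action-value function is approximated as Q(s, a) ≈ φ(s, a)ᵀw for a known feature map φ, and the weight vector w is fit by least-squares regression on one-step Bellman targets. The feature map, toy reward, and transition data below are invented for illustration.

```python
import numpy as np

# Sketch of linear value-function approximation: Q(s, a) ~= phi(s, a)^T w,
# fitting w by least squares on one-step Bellman targets (one LSVI-style step).
# The feature map and synthetic MDP data here are illustrative assumptions.

rng = np.random.default_rng(0)
d = 4  # feature dimension


def phi(state, action):
    """Illustrative feature map taking (state, action) to R^d."""
    return np.array([1.0, state, action, state * action])


# Synthetic transitions (s, a, r, s') over a continuous state space [-1, 1]
# with two discrete actions {0, 1}.
states = rng.uniform(-1, 1, size=200)
actions = rng.integers(0, 2, size=200).astype(float)
rewards = states * actions                      # toy reward, linear in features
next_states = np.clip(states + 0.1 * actions, -1, 1)

gamma = 0.9
w = np.zeros(d)  # current weight estimate; Q is identically 0 initially


def q_value(w, s, a):
    return phi(s, a) @ w


# One least-squares value-iteration step: regress the Bellman targets
# r + gamma * max_{a'} Q(s', a') onto the features phi(s, a).
Phi = np.stack([phi(s, a) for s, a in zip(states, actions)])
targets = rewards + gamma * np.array(
    [max(q_value(w, s2, a2) for a2 in (0.0, 1.0)) for s2 in next_states]
)
w_new, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
```

Since the toy reward s·a lies exactly in the span of the features, the first regression step recovers w ≈ (0, 0, 0, 1); in the infinite state spaces the abstract refers to, this d-dimensional regression replaces the intractable per-state table update.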