Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems
Neural Information Processing Systems
We study Reinforcement Learning for partially observable dynamical systems using function approximation. We propose a new Partially Observable Bilinear Actor-Critic framework that is general enough to include models such as observable tabular Partially Observable Markov Decision Processes (POMDPs), observable Linear-Quadratic-Gaussian (LQG) systems, Predictive State Representations (PSRs), a newly introduced model called Hilbert Space Embeddings of POMDPs, and observable POMDPs with latent low-rank transition.