Weathering Ongoing Uncertainty: Learning and Planning in a Time-Varying Partially Observable Environment
Gokul Puthumanaillam, Xiangyu Liu, Negar Mehr, Melkior Ornik
–arXiv.org Artificial Intelligence
Optimal decision-making presents a significant challenge for autonomous systems operating in uncertain, stochastic, and time-varying environments. Environmental variability over time can significantly impact the system's optimal decision-making strategy for mission completion. To model such environments, our work combines the previous notion of Time-Varying Markov Decision Processes (TVMDP) with partial observability and introduces Time-Varying Partially Observable Markov Decision Processes (TV-POMDP). We propose a two-pronged approach to accurately estimate and plan within the TV-POMDP: 1) Memory Prioritized State Estimation (MPSE), which leverages weighted memory to provide more accurate time-varying transition estimates; and 2) an MPSE-integrated planning strategy that optimizes long-term rewards while accounting for temporal constraints. We validate the proposed framework and algorithms using simulations and hardware, with robots exploring partially observable, time-varying environments. Our results demonstrate superior performance over standard methods, highlighting the framework's effectiveness in stochastic, uncertain, time-varying domains.
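To make the idea of weighted-memory transition estimation concrete, the sketch below shows one plausible way such an estimator could work: observed transitions are stored with timestamps, and the empirical transition distribution is computed with recency-based weights so newer samples dominate in a time-varying environment. This is a minimal illustration under assumed design choices (exponential decay, dictionary-based memory); the class and method names are hypothetical and do not reflect the paper's actual MPSE algorithm or API.

```python
# Hypothetical sketch of recency-weighted transition estimation for a
# time-varying environment. Names (TransitionMemory, decay, estimate)
# are illustrative, not taken from the paper.
from collections import defaultdict


class TransitionMemory:
    def __init__(self, decay=0.9):
        self.decay = decay      # how quickly older samples lose influence
        self.samples = []       # stored tuples: (time, state, action, next_state)

    def record(self, t, s, a, s_next):
        """Store one observed transition with its timestamp."""
        self.samples.append((t, s, a, s_next))

    def estimate(self, t_now, s, a):
        """Weighted empirical distribution over next states at query time t_now."""
        weights = defaultdict(float)
        total = 0.0
        for t, s_i, a_i, s_next in self.samples:
            if (s_i, a_i) != (s, a):
                continue
            w = self.decay ** (t_now - t)   # newer samples receive larger weight
            weights[s_next] += w
            total += w
        if total == 0.0:
            return {}                       # no data for this state-action pair
        return {s_next: w / total for s_next, w in weights.items()}


# Usage: record transitions as they are observed, then query an estimate
# of the transition distribution at the current time.
mem = TransitionMemory(decay=0.8)
mem.record(t=0, s="dry", a="move", s_next="dry")
mem.record(t=5, s="dry", a="move", s_next="muddy")
print(mem.estimate(t_now=6, s="dry", a="move"))
```

Such a time-weighted estimate could then feed into a planner that re-queries the model as the environment drifts, which is the role the MPSE-integrated planning strategy plays in the paper.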
Jan-19-2024