Online POMDP Planning
Online POMDP Planning with Anytime Deterministic Guarantees
Autonomous agents operating in real-world scenarios frequently encounter uncertainty and make decisions based on incomplete information. Planning under uncertainty can be mathematically formalized using partially observable Markov decision processes (POMDPs). However, finding an optimal plan for POMDPs can be computationally expensive and is feasible only for small tasks. In recent years, approximate algorithms, such as tree search and sample-based methodologies, have emerged as state-of-the-art POMDP solvers for larger problems. Despite their effectiveness, these algorithms offer only probabilistic and often asymptotic guarantees toward the optimal solution due to their dependence on sampling.
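For readers who want the formal backdrop, the standard textbook POMDP belief update and value recursion that such planners approximate can be written as follows; this is generic notation, not a construction specific to this paper.

```latex
% Standard POMDP formalization (textbook notation): a POMDP is a tuple
% (S, A, O, T, Z, r, \gamma); the agent maintains a belief b over states.
\[
  b'(s') \;=\; \eta \, Z(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s),
  \qquad
  \eta \;=\; \Big( \sum_{s' \in S} Z(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s) \Big)^{-1}
\]
\[
  V^*(b) \;=\; \max_{a \in A} \Big[ r(b, a) + \gamma \sum_{o \in O} \Pr(o \mid b, a)\, V^*(b^{a,o}) \Big]
\]
```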
Sampling Networks and Aggregate Simulation for Online POMDP Planning
The paper introduces a new algorithm for planning in partially observable Markov decision processes (POMDPs) based on the idea of aggregate simulation. The algorithm uses product distributions to approximate the belief state and shows how to build a representation graph of an approximate action-value function over belief space.
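To make the product-distribution idea concrete, below is a minimal sketch in which a belief over binary state variables is stored as per-variable marginals and rolled forward in time. The toy dynamics and all names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a product-distribution (mean-field) belief, assuming a
# factored POMDP with binary state variables. The toy transition model is
# illustrative only.
import numpy as np

def project_belief(marginals, trans_probs, action):
    """One-step forward projection of a factored belief.

    marginals:   array of shape (n,), marginals[i] = P(x_i = 1).
    trans_probs: function (i, p, action) -> new P(x_i = 1); a stand-in for a
                 factored transition model where each variable depends on its
                 own current marginal.
    """
    return np.array([trans_probs(i, marginals[i], action)
                     for i in range(len(marginals))])

def toy_trans(i, p, action):
    stay = 0.9                       # probability the variable keeps its value
    drive = 0.8 if action == 1 else 0.2
    return stay * p + (1.0 - stay) * drive

belief = np.full(4, 0.5)             # uniform product belief over 4 binary vars
for _ in range(3):                   # aggregate simulation: roll beliefs forward
    belief = project_belief(belief, toy_trans, action=1)
print(belief)                        # marginals drift toward the action's drive
```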
Online POMDP Planning with Anytime Deterministic Guarantees - Supplementary
We restate some of the definitions from the paper for convenience. Moreover, depending on the algorithm implementation, the number of iterations can be finite. In the Tag problem, after the allowable time steps ended, the simulation was reset to its initial state. The main goal is to tag as many opponents as possible within a given time frame; the states reflect the agent's current position and the opponents' positions. The Baby POMDP is a classic problem that represents the scenario of a baby and a caregiver. The states in this problem represent the baby's needs, such as whether the baby is hungry, and the observations are binary: either the baby is crying or not.
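The Baby POMDP described above is small enough to write out directly. Below is an illustrative encoding; the transition, observation, and reward probabilities are placeholders chosen for the sketch, not the values used in the paper.

```python
# Illustrative encoding of the Baby POMDP described above; all numerical
# probabilities and rewards are placeholders.
import random

STATES = ["hungry", "sated"]         # the baby's needs
ACTIONS = ["feed", "ignore"]
OBS = ["crying", "quiet"]            # binary observation

def transition(state, action):
    if action == "feed":
        return "sated"
    # An ignored sated baby may become hungry.
    return "hungry" if state == "hungry" or random.random() < 0.2 else "sated"

def observe(state):
    p_cry = 0.8 if state == "hungry" else 0.1   # hungry babies cry more often
    return "crying" if random.random() < p_cry else "quiet"

def reward(state, action):
    return (-10.0 if state == "hungry" else 0.0) + (-5.0 if action == "feed" else 0.0)

s = "hungry"
for _ in range(3):
    a = "feed" if observe(s) == "crying" else "ignore"   # naive reactive policy
    print(s, a, reward(s, a))
    s = transition(s, a)
```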
HyPlan: Hybrid Learning-Assisted Planning Under Uncertainty for Safe Autonomous Driving
Pfaffmann, Donald, Klusch, Matthias, Steinmetz, Marcel
Abstract: We present a novel hybrid learning-assisted planning method, named HyPlan, for solving the collision-free navigation problem for self-driving cars in partially observable traffic environments. HyPlan combines methods for multi-agent behavior prediction, deep reinforcement learning with proximal policy optimization, and approximated online POMDP planning with heuristic confidence-based vertical pruning to reduce its execution time without compromising driving safety. Our experimental performance analysis on the CARLA-CTS2 benchmark of critical traffic scenarios with pedestrians revealed that HyPlan may navigate more safely than selected relevant baselines and performs significantly faster than the considered alternative online POMDP planners.

In general, the collision-free navigation (CFN) problem for a self-driving car is to minimize the time to a given goal while avoiding collisions with other objects such as other cars, pedestrians, and cyclists in a partially observable traffic environment. This constrained optimization problem can be modeled as a POMDP and solved online by the autonomous vehicle (AV). While hybrid neuro-explicit planning methods such as LEADER [9], LFGnav [28] and HyLEAP [27] may navigate reasonably safely with provably correct action planning under uncertainty, they still suffer from significantly slower execution times compared to deep learning-based methods.
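The CFN objective described above (minimize time to goal, never collide) is often encoded as a POMDP reward. The sketch below is a hedged illustration of that encoding; the constants and the dictionary-based state are hypothetical, not HyPlan's actual model.

```python
# Hedged sketch of a CFN-style objective as a POMDP reward: a per-step cost
# pushes toward reaching the goal quickly, and collisions carry a dominating
# penalty. State encoding and constants are hypothetical.
def cfn_reward(state):
    if state.get("collision", False):
        return -1000.0        # safety: collisions dominate every other term
    if state.get("at_goal", False):
        return 100.0          # goal bonus; episode ends here
    return -1.0               # time penalty: minimize steps to the goal

print(cfn_reward({"collision": False}))   # -1.0: an ordinary driving step
```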
DESPOT: Online POMDP Planning with Regularization
POMDPs provide a principled framework for planning under uncertainty, but are computationally intractable, due to the "curse of dimensionality" and the "curse of history". This paper presents an online lookahead search algorithm that alleviates these difficulties by limiting the search to a set of sampled scenarios. The execution of all policies on the sampled scenarios is summarized using a Determinized Sparse Partially Observable Tree (DESPOT), which is a sparsely sampled belief tree. Our algorithm, named Regularized DESPOT (R-DESPOT), searches the DESPOT for a policy that optimally balances the size of the policy and the accuracy on its value estimate obtained through sampling. We give an output-sensitive performance bound for all policies derived from the DESPOT, and show that R-DESPOT works well if a small optimal policy exists. We also give an anytime approximation to R-DESPOT. Experiments show strong results, compared with two of the fastest online POMDP algorithms.
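The abstract's notion of optimally balancing policy size against the accuracy of its value estimate can be summarized as a regularized objective. The formulation below is a sketch consistent with that description, with $\lambda$ as the regularization weight.

```latex
% Regularized objective sketched from the abstract's description: R-DESPOT
% searches the DESPOT for a policy \pi balancing estimated value against size.
\[
  \pi^* \;=\; \arg\max_{\pi \in \Pi_{\text{DESPOT}}}
      \Big( \hat{V}(\pi) \;-\; \lambda\, |\pi| \Big)
\]
% Here \hat{V}(\pi) is the empirical value of \pi on the sampled scenarios,
% |\pi| is the policy size, and \lambda > 0 guards against overfitting to the
% sampled scenarios.
```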
Reviews: Sampling Networks and Aggregate Simulation for Online POMDP Planning
Author feedback: I thank the authors for the feedback. The feedback was of high quality and satisfied my concerns. I suggest that a version, perhaps compressed, of "Explaining limitations of our work" from the author feedback, which I enjoyed reading, be added to the final version of the paper.

The paper "Sampling Networks and Aggregate Simulation for Online POMDP Planning" proposes a new solution for computing policies for large POMDP problems that is based on factorizing the belief distribution using a mean-field approximation during planning and execution, and on extending aggregate simulation to POMDPs. In short, the proposed POMDP planner projects factorized beliefs forward in time, forming a computational graph along the way, and then computes gradients backwards in time over the graph to improve the policy.
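The reviewer's summary suggests a simple mental model: a differentiable rollout of factorized beliefs followed by gradient-based improvement of policy parameters. The sketch below illustrates that loop with toy dynamics; finite differences stand in for the paper's gradient computation over the graph, and every name and constant is illustrative.

```python
# Sketch of the described planner loop: roll a factorized belief forward under
# policy parameters, then improve the parameters using gradients of the
# resulting value. Finite differences replace the paper's backward pass.
import numpy as np

def rollout_value(theta, belief, horizon=5):
    """Project the product belief forward; return accumulated expected reward."""
    total, b = 0.0, belief.copy()
    for t in range(horizon):
        a = 1.0 / (1.0 + np.exp(-theta[t]))   # soft per-step "action intensity"
        b = 0.9 * b + 0.1 * a                  # toy factored belief dynamics
        total += b.mean()                      # toy reward: mean marginal
    return total

def improve(theta, belief, lr=0.5, eps=1e-5):
    grad = np.zeros_like(theta)
    for i in range(len(theta)):                # backward-pass stand-in
        d = np.zeros_like(theta); d[i] = eps
        grad[i] = (rollout_value(theta + d, belief)
                   - rollout_value(theta - d, belief)) / (2 * eps)
    return theta + lr * grad                   # gradient ascent on the value

theta, belief = np.zeros(5), np.full(4, 0.5)
for _ in range(50):
    theta = improve(theta, belief)
print(rollout_value(theta, belief))            # value increases over updates
```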
Reviews: Sampling Networks and Aggregate Simulation for Online POMDP Planning
All reviewers appreciate a practical approach to tackling POMDPs with large state and observation spaces using factorized beliefs and aggregate simulation. Reviewers had some concerns regarding the limitations imposed by the factorization assumption, but these concerns are addressed in the author feedback. Reviewers are particularly happy about the quality of the rebuttal and encourage the authors to incorporate the discussion of the algorithm's limitations into the final draft.
Scaling Long-Horizon Online POMDP Planning via Rapid State Space Sampling
Liang, Yuanchu, Kim, Edward, Thomason, Wil, Kingston, Zachary, Kurniawati, Hanna, Kavraki, Lydia E.
Partially Observable Markov Decision Processes (POMDPs) are a general and principled framework for motion planning under uncertainty. Despite tremendous improvement in the scalability of POMDP solvers, long-horizon POMDPs (e.g., $\geq 15$ steps) remain difficult to solve. This paper proposes a new approximate online POMDP solver, called Reference-Based Online POMDP Planning via Rapid State Space Sampling (ROP-RaS3). ROP-RaS3 uses novel extremely fast sampling-based motion planning techniques to sample the state space and generate a diverse set of macro actions online, which are then used to bias belief-space sampling and infer high-quality policies without requiring exhaustive enumeration of the action space -- a fundamental constraint for modern online POMDP solvers. ROP-RaS3 is evaluated on various long-horizon POMDPs, including a problem with a planning horizon of more than 100 steps and a problem with a 15-dimensional state space that requires more than 20 lookahead steps. In all of these problems, ROP-RaS3 substantially outperforms other state-of-the-art methods, in some cases by multiple fold.
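The abstract's key idea, evaluating a small sampled set of macro actions instead of enumerating primitive actions, can be sketched as follows. All helpers here are hypothetical stand-ins for the fast sampling-based motion planner the paper uses; the 1-D instantiation at the bottom exists only to make the sketch runnable.

```python
# Hedged sketch: draw k macro actions (action sequences steering toward
# sampled state-space waypoints) and score only those on belief particles,
# rather than enumerating the primitive action space.
import random

def sample_macro_actions(state_sampler, steer, k=8, length=10):
    """Generate k macro actions by steering toward k sampled states."""
    macros = []
    for _ in range(k):
        waypoint = state_sampler()
        macros.append([steer(waypoint, t) for t in range(length)])
    return macros

def plan_step(belief_particles, state_sampler, steer, simulate):
    macros = sample_macro_actions(state_sampler, steer)
    # Score each macro action by Monte Carlo rollouts from the belief.
    def score(macro):
        return sum(simulate(s, macro) for s in belief_particles) / len(belief_particles)
    return max(macros, key=score)

# Toy 1-D instantiation of the hypothetical helpers.
best = plan_step(
    belief_particles=[random.gauss(0.0, 1.0) for _ in range(20)],
    state_sampler=lambda: random.uniform(-10, 10),
    steer=lambda w, t: 1.0 if w > 0 else -1.0,      # push toward the waypoint
    simulate=lambda s, macro: -abs(s + sum(macro)),  # reward: end near origin
)
print(len(best))                                     # chosen macro, 10 steps
```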