AITopics | pomdp model

Neural Information Processing Systems http://nips.cc/

artificial intelligence, bayesian inference, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.94)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.95)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Peter Karkus, David Hsu, Wee Sun Lee

Neural Information Processing SystemsNov-21-2025, 13:36:33 GMT

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Add feedback

A Bayesian Framework for Modeling Confidence in Perceptual Decision Making

Koosha Khalvati, Rajesh PN Rao

Neural Information Processing SystemsOct-2-2025, 02:53:22 GMT

The degree of confidence in one's choice or decision is a critical aspect of perceptual decision making. Attempts to quantify a decision maker's confidence by measuring accuracy in a task have yielded limited success because confidence and accuracy are typically not equal. In this paper, we introduce a Bayesian framework to model confidence in perceptual decision making. We show that this model, based on partially observable Markov decision processes (POMDPs), is able to predict confidence of a decision maker based only on the data available to the experimenter. We test our model on two experiments on confidence-based decision making involving the well-known random dots motion discrimination task. In both experiments, we show that our model's predictions closely match experimental data. Additionally, our model is also consistent with other phenomena such as the hard-easy effect in perceptual decision making.

coherence, decision maker, pomdp, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing SystemsAug-19-2025, 08:00:53 GMT

Too many technical details for simulations are missing-e.g., sure rewards, cost of sampling, discount factor. As noted on line 256, coherence is observable to the experimenter, but not to the subject. The virtues of normativity for modeling human behaviour are not adequately defended here. Why would normativity be good in and of itself? Does the model make any clear, testable predictions?

coherence, experimenter, pomdp, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.05)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.33)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Peter Karkus, David Hsu, Wee Sun Lee

Neural Information Processing SystemsOct-4-2024, 09:35:56 GMT

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDPnet on different tasks so that it can generalize to new ones in the parameterized task set and "transfer" to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.

algorithm, architecture, qmdp-net, (13 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Add feedback

A Bayesian Framework for Modeling Confidence in Perceptual Decision Making

Neural Information Processing SystemsMar-12-2024, 21:59:31 GMT

The degree of confidence in one's choice or decision is a critical aspect of perceptual decision making. Attempts to quantify a decision maker's confidence by measuring accuracy in a task have yielded limited success because confidence and accuracy are typically not equal. In this paper, we introduce a Bayesian framework to model confidence in perceptual decision making. We show that this model, based on partially observable Markov decision processes (POMDPs), is able to predict confidence of a decision maker based only on the data available to the experimenter. We test our model on two experiments on confidence-based decision making involving the well-known random dots motion discrimination task. In both experiments, we show that our model's predictions closely match experimental data. Additionally, our model is also consistent with other phenomena such as the hard-easy effect in perceptual decision making.

coherence, decision maker, pomdp, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Probabilistic Model of Social Decision Making based on Reward Maximization

Neural Information Processing SystemsMar-12-2024, 08:30:28 GMT

A fundamental problem in cognitive neuroscience is how humans make decisions, act, and behave in relation to other humans. Here we adopt the hypothesis that when we are in an interactive social setting, our brains perform Bayesian inference of the intentions and cooperativeness of others using probabilistic representations. We employ the framework of partially observable Markov decision processes (POMDPs) to model human decision making in a social context, focusing specifically on the volunteer's dilemma in a version of the classic Public Goods Game. We show that the POMDP model explains both the behavior of subjects as well as neural activity recorded using fMRI during the game. The decisions of subjects can be modeled across all trials using two interpretable parameters. Furthermore, the expected reward predicted by the model for each subject was correlated with the activation of brain areas related to reward expectation in social interactions. Our results suggest a probabilistic basis for human social decision making within the framework of expected reward maximization.

artificial intelligence, machine learning, pomdp, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Safe POMDP Online Planning via Shielding

Sheng, Shili, Parker, David, Feng, Lu

arXiv.org Artificial IntelligenceSep-18-2023

Partially observable Markov decision processes (POMDPs) have been widely used in many robotic applications for sequential decision-making under uncertainty. POMDP online planning algorithms such as Partially Observable Monte-Carlo Planning (POMCP) can solve very large POMDPs with the goal of maximizing the expected return. But the resulting policies cannot provide safety guarantees that are imperative for real-world safety-critical tasks (e.g., autonomous driving). In this work, we consider safety requirements represented as almost-sure reach-avoid specifications (i.e., the probability to reach a set of goal states is one and the probability to reach a set of unsafe states is zero). We compute shields that restrict unsafe actions violating almost-sure reach-avoid specifications. We then integrate these shields into the POMCP algorithm for safe POMDP online planning. We propose four distinct shielding methods, differing in how the shields are computed and integrated, including factored variants designed to improve scalability. Experimental results on a set of benchmark domains demonstrate that the proposed shielding methods successfully guarantee safety (unlike the baseline POMCP without shielding) on large POMDPs, with negligible impact on the runtime for online planning.

algorithm, online planning, pomdp, (16 more...)

arXiv.org Artificial Intelligence

2309.10216

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
Europe > United Kingdom (0.14)

Genre: Research Report (0.82)

Industry: Transportation (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

The Infinite Partially Observable Markov Decision Process

Neural Information Processing SystemsApr-6-2023, 14:07:08 GMT

The Partially Observable Markov Decision Process (POMDP) framework has proven useful in planning domains that require balancing actions that increase an agents knowledge and actions that increase an agents reward. Unfortunately, most POMDPs are complex structures with a large number of parameters. In many realworld problems, both the structure and the parameters are difficult to specify from domain knowledge alone. Recent work in Bayesian reinforcement learning has made headway in learning POMDP models; however, this work has largely focused on learning the parameters of the POMDP model. We define an infinite POMDP (iPOMDP) model that does not require knowledge of the size of the state space; instead, it assumes that the number of visited states will grow as the agent explores its world and explicitly models only visited states.

observable markov decision process, pomdp model

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Partially Observable Markov Decision Processes in Robotics: A Survey

Lauri, Mikko, Hsu, David, Pajarinen, Joni

arXiv.org Artificial IntelligenceSep-21-2022

Noisy sensing, imperfect control, and environment changes are defining characteristics of many real-world robot tasks. The partially observable Markov decision process (POMDP) provides a principled mathematical framework for modeling and solving robot decision and control tasks under uncertainty. Over the last decade, it has seen many successful applications, spanning localization and navigation, search and tracking, autonomous driving, multi-robot systems, manipulation, and human-robot interaction. This survey aims to bridge the gap between the development of POMDP models and algorithms at one end and application to diverse robot decision tasks at the other. It analyzes the characteristics of these tasks and connects them with the mathematical and algorithmic properties of the POMDP framework for effective modeling and solution. For practitioners, the survey provides some of the key task characteristics in deciding when and how to apply POMDPs to robot tasks successfully. For POMDP algorithm designers, the survey provides new insights into the unique challenges of applying POMDPs to robot systems and points to promising new directions for further research.

artificial intelligence, machine learning, pomdp, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TRO.2022.3200138

2209.10342

Country: