AITopics | qmdp-net

QMDP-Net: Deep Learning for Planning under Partial Observability

Neural Information Processing SystemsMar-17-2026, 18:22:51 GMT

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and "transfer" to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.79)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.61)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Neural Information Processing SystemsNov-21-2025, 16:12:45 GMT

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and "transfer" to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.

deep learning, name change, partial observability, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.79)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.61)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Peter Karkus, David Hsu, Wee Sun Lee

Neural Information Processing SystemsNov-21-2025, 13:36:33 GMT

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Add feedback

Reviews: QMDP-Net: Deep Learning for Planning under Partial Observability

Neural Information Processing SystemsOct-8-2024, 12:10:46 GMT

The paper proposes a novel policy network architecture for partially observable environments. The network includes a filtering module and a planning module. The filtering module mimics computation of the current belief of the agent given its previous belief, the last action and the last observation. The model of the environment and the observation function are replaced with trainable neural modules. The planning module runs value iteration for an MDP, whose transition function and reward function are also trainable neural modules.

deep learning, module, partial observability, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Peter Karkus, David Hsu, Wee Sun Lee

Neural Information Processing SystemsOct-4-2024, 09:35:56 GMT

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDPnet on different tasks so that it can generalize to new ones in the parameterized task set and "transfer" to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.

algorithm, architecture, qmdp-net, (13 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Karkus, Peter, Hsu, David, Lee, Wee Sun

Neural Information Processing SystemsFeb-14-2020, 15:57:39 GMT

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and "transfer" to other similar tasks beyond the set.

deep learning, partial observability, qmdp-net, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

Partially Observable Planning and Learning for Systems with Non-Uniform Dynamics

Collins, Nicholas, Kurniawati, Hanna

arXiv.org Artificial IntelligenceJul-9-2019

We propose a neural network architecture, called TransNet, that combines planning and model learning for solving Partially Observable Markov Decision Processes (POMDPs) with non-uniform system dynamics. The past decade has seen a substantial advancement in solving POMDP problems. However, constructing a suitable POMDP model remains difficult. Recently, neural network architectures have been proposed to alleviate the difficulty in acquiring such models. Although the results are promising, existing architectures restrict the type of system dynamics that can be learned --that is, system dynamics must be the same in all parts of the state space. TransNet relaxes such a restriction. Key to this relaxation is a novel neural network module that classifies the state space into classes and then learns the system dynamics of the different classes. TransNet uses this module together with the overall architecture of QMDP-Net[1] to allow solving POMDPs that have more expressive dynamic models, while maintaining efficient data requirement. Its evaluation on typical benchmarks in robot navigation with initially unknown system and environment models indicates that TransNet substantially out-performs the quality of the generated policies and learning efficiency of the state-of-the-art method QMDP-Net.

artificial intelligence, machine learning, transnet, (18 more...)

arXiv.org Artificial Intelligence

1907.04457

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Karkus, Peter, Hsu, David, Lee, Wee Sun

Neural Information Processing SystemsDec-31-2017

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and “transfer” to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Add feedback

QMDP-Net: Deep Learning for Planning under Partial Observability

Karkus, Peter, Hsu, David, Lee, Wee Sun

arXiv.org Artificial IntelligenceNov-2-2017

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and "transfer" to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.

artificial intelligence, machine learning, qmdp-net, (18 more...)

arXiv.org Artificial Intelligence

1703.06692

Genre: