
Collaborating Authors

mdp-gape




Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Neural Information Processing Systems

We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process in which transitions have finite support. We prove an upper bound on the number of sampled trajectories needed for MDP-GapE to identify a near-optimal action with high probability. This problem-dependent result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration. Our experiments reveal that MDP-GapE is also effective in practice, in contrast with other algorithms with sample complexity guarantees in the fixed-confidence setting, which are mostly theoretical.
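The abstract does not reproduce the algorithm itself, so the following is only a minimal sketch of the fixed-confidence, gap-based stopping idea it describes: sample trajectories with a generative model, maintain confidence intervals on the value of each candidate first action, and stop once every alternative's upper bound is within epsilon of the candidate's lower bound. The toy model `step`, the uniform `rollout` policy, and `identify_near_optimal_action` are illustrative assumptions, not the authors' MDP-GapE.

```python
import math
import random

# Hypothetical toy generative model (three states, two actions), standing in
# for the simulator the planner is assumed to query: (s, a) -> (s', r).
def step(state, action):
    next_state = (state + action) % 3
    return next_state, (1.0 if next_state == 0 else 0.0)

def rollout(state, first_action, horizon):
    """Sample one trajectory: take first_action, then act uniformly at random."""
    total, s, a = 0.0, state, first_action
    for _ in range(horizon):
        s, r = step(s, a)
        total += r
        a = random.randrange(2)  # the toy model has two actions
    return total

def identify_near_optimal_action(root, n_actions=2, horizon=5,
                                 epsilon=0.5, delta=0.05, max_rounds=10000):
    """Fixed-confidence identification of an epsilon-optimal first action,
    using Hoeffding confidence bounds on Monte-Carlo returns."""
    counts, means = [0] * n_actions, [0.0] * n_actions
    best = 0
    for t in range(1, max_rounds + 1):
        for a in range(n_actions):  # non-adaptive sampling, for brevity
            ret = rollout(root, a, horizon)
            counts[a] += 1
            means[a] += (ret - means[a]) / counts[a]
        # Returns lie in [0, horizon]; union bound over actions and rounds.
        def radius(a):
            return horizon * math.sqrt(
                math.log(2 * n_actions * t * (t + 1) / delta) / (2 * counts[a]))
        best = max(range(n_actions), key=lambda a: means[a])
        gap = max(means[a] + radius(a) for a in range(n_actions) if a != best) \
              - (means[best] - radius(best))
        if gap <= epsilon:  # stopping rule: best is epsilon-optimal w.h.p.
            return best, t
    return best, max_rounds

print(identify_near_optimal_action(root=0))
```

Unlike this uniform sketch, the paper's analysis is problem-dependent: the number of trajectories needed adapts to the sub-optimality gaps of the state-action pairs actually visited.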






Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Jonsson, Anders, Kaufmann, Emilie, Ménard, Pierre, Domingues, Omar Darwiche, Leurent, Edouard, Valko, Michal

arXiv.org Machine Learning

We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process in which transitions have finite support. We prove an upper bound on the number of calls to the generative model needed for MDP-GapE to identify a near-optimal action with high probability. This problem-dependent sample complexity result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration. Our experiments reveal that MDP-GapE is also effective in practice, in contrast with other algorithms with sample complexity guarantees in the fixed-confidence setting, which are mostly theoretical.
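This version of the abstract measures sample complexity in calls to the generative model rather than in sampled trajectories. As a purely illustrative aid (the `CountingModel` wrapper below is a hypothetical helper, not from the paper), one can instrument a simulator to count such calls; for a horizon-H planner, each sampled trajectory costs exactly H calls, so the two measures differ only by that factor.

```python
class CountingModel:
    """Wrap a generative model (s, a) -> (s', r) and count how often it is queried."""
    def __init__(self, step_fn):
        self.step_fn = step_fn
        self.calls = 0

    def step(self, state, action):
        self.calls += 1
        return self.step_fn(state, action)

# Usage with a toy model: one trajectory of horizon 5 costs exactly 5 calls.
model = CountingModel(lambda s, a: ((s + a) % 3, float((s + a) % 3 == 0)))
s = 0
for _ in range(5):
    s, r = model.step(s, 1)
print(model.calls)  # -> 5
```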