AITopics | monte-carlo planning

You are a robot and you live in a Markov decision process (MDP) with a finite or an infinite number of transitions from state-action to next states. You got brains and so you plan before you act. Luckily, your roboparents equipped you with a generative model to do some Monte-Carlo planning. The world is waiting for you and you have no time to waste. You want your planning to be efficient.

artificial intelligence, machine learning, node, (16 more...)

Neural Information Processing Systems

Country: Europe (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis

Neural Information Processing Systems | Feb-8-2026, 00:03:03 GMT

In this paper, we consider Monte-Carlo planning in an environment with continuous state-action spaces, a much less understood problem with important applications in control and robotics.

algorithm, artificial intelligence, planning & scheduling, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.89)

0d85eb24e2add96ff1a7021f83c1abc9-Supplemental.pdf

Neural Information Processing Systems | Feb-7-2026, 11:33:18 GMT

algorithm, mdp-gape, sample complexity, (12 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Neural Information Processing Systems | Feb-7-2026, 11:33:10 GMT

This problem-dependent sample complexity result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration.

artificial intelligence, nth, planning & scheduling, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.70)

POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis

Neural Information Processing Systems | Dec-23-2025, 22:12:16 GMT

Monte-Carlo planning, as exemplified by Monte-Carlo Tree Search (MCTS), has demonstrated remarkable performance in applications with finite spaces. In this paper, we consider Monte-Carlo planning in an environment with continuous state-action spaces, a much less understood problem with important applications in control and robotics. We introduce POLY-HOOT, an algorithm that augments MCTS with a continuous armed bandit strategy named Hierarchical Optimistic Optimization (HOO) (Bubeck et al., 2011). Specifically, we enhance HOO by using an appropriate polynomial, rather than logarithmic, bonus term in the upper confidence bounds. Such a polynomial bonus is motivated by its empirical successes in AlphaGo Zero (Silver et al., 2017b), as well as its significant role in achieving theoretical guarantees of finite space MCTS (Shah et al., 2019). We investigate, for the first time, the regret of the enhanced HOO algorithm in non-stationary bandit problems. Using this result as a building block, we establish non-asymptotic convergence guarantees for POLY-HOOT: the value estimate converges to an arbitrarily small neighborhood of the optimal value function at a polynomial rate. We further provide experimental results that corroborate our theoretical findings.
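The difference between a logarithmic and a polynomial bonus can be illustrated on a plain finite-armed bandit. The sketch below is not POLY-HOOT and does not build the HOO tree over a continuous action space; it only swaps the bonus term inside a standard UCB index. The function names (`log_bonus`, `poly_bonus`, `run_ucb`) and the exponents `alpha=0.25`, `beta=0.5` are assumptions chosen for illustration, not values taken from the paper.

```python
import math
import random


def log_bonus(t, n):
    """Classic UCB1-style bonus: sqrt(2 * ln(t) / n)."""
    return math.sqrt(2.0 * math.log(t) / n)


def poly_bonus(t, n, alpha=0.25, beta=0.5):
    """Polynomial bonus t**alpha / n**beta.

    The exponents are illustrative assumptions, not the paper's constants.
    """
    return (t ** alpha) / (n ** beta)


def run_ucb(means, horizon, bonus, seed=0):
    """Run an index policy on Bernoulli arms and return per-arm pull counts.

    means   -- true success probability of each arm (hypothetical test data)
    horizon -- total number of pulls
    bonus   -- function (t, n) -> exploration bonus added to the empirical mean
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialisation: pull every arm once
        else:
            # pick the arm with the largest upper confidence index
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i] + bonus(t, counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    arms = [0.2, 0.8]
    print("log bonus pulls: ", run_ucb(arms, 2000, log_bonus))
    print("poly bonus pulls:", run_ucb(arms, 2000, poly_bonus))
```

With these assumed exponents the polynomial bonus shrinks more slowly in `t` than the logarithmic one, so the weaker arm keeps being revisited more often; that extra exploration is the kind of behaviour a non-stationary analysis has to account for.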

continuous space mdp, monte-carlo planning, poly-hoot, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.86)

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Neural Information Processing Systems | Oct-2-2025, 01:07:10 GMT

This problem-dependent sample complexity result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration.
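For orientation, the sub-optimality gap of a state-action pair is commonly defined as the difference between the optimal value of the state and the optimal action value of the pair. The notation below is standard reinforcement-learning notation assumed here for illustration; it is not taken from the snippet above, and the paper may use a stage-indexed or otherwise refined variant.

```latex
\Delta(s, a) \;=\; V^{\star}(s) - Q^{\star}(s, a) \;\ge\; 0,
\qquad
V^{\star}(s) \;=\; \max_{a'} Q^{\star}(s, a')
```

Under this definition an optimal action has gap zero, and a larger gap means the action is easier to rule out from samples.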

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

Neural Information Processing Systems | Oct-2-2025, 01:07:01 GMT

This problem-dependent sample complexity result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration.

algorithm, mdp-gape, sample complexity, (11 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.41)

Review for NeurIPS paper: POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis

Neural Information Processing Systems | Jan-23-2025, 03:52:44 GMT

Typically, MCTS is useful only in discrete action settings, and this paper studies the extension to continuous actions with the aim of theoretically justifying the approach taken. The approach is relevant to people interested in planning or in continuous action control (e.g., robotics). The paper first extends an existing UCB-like algorithm for continuous-armed bandits, HOO, by using a polynomial exploration bonus instead of a logarithmic one. This choice is justified by a similar approach in the influential AlphaGo paper and by prior work that justifies it theoretically for non-stationary bandit problems. The paper then integrates this enhanced HOO into MCTS and calls the resulting algorithm Poly-HOOT. Theoretical results are given for convergence of the approach to the optimal action, and empirical results show that the method outperforms baselines. Overall, I liked the paper and think it clears the acceptance bar.

continuous space mdp, monte-carlo planning, non-asymptotic analysis, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.73)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.40)

Review for NeurIPS paper: POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis

Neural Information Processing Systems | Jan-23-2025, 03:52:37 GMT

Reviewers all agreed that this paper presents a novel work towards MCTS in continuous action spaces, with its theoretical analysis making an important contribution.

continuous space mdp, monte-carlo planning, non-asymptotic analysis, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.40)

POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis

Neural Information Processing Systems | Oct-9-2024, 21:27:19 GMT

Monte-Carlo planning, as exemplified by Monte-Carlo Tree Search (MCTS), has demonstrated remarkable performance in applications with finite spaces. In this paper, we consider Monte-Carlo planning in an environment with continuous state-action spaces, a much less understood problem with important applications in control and robotics. We introduce POLY-HOOT, an algorithm that augments MCTS with a continuous armed bandit strategy named Hierarchical Optimistic Optimization (HOO) (Bubeck et al., 2011). Specifically, we enhance HOO by using an appropriate polynomial, rather than logarithmic, bonus term in the upper confidence bounds. Such a polynomial bonus is motivated by its empirical successes in AlphaGo Zero (Silver et al., 2017b), as well as its significant role in achieving theoretical guarantees of finite space MCTS (Shah et al., 2019). We investigate, for the first time, the regret of the enhanced HOO algorithm in non-stationary bandit problems.

monte-carlo planning, non-asymptotic analysis, poly-hoot, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
