AITopics | optimistic planning

Collaborating Authors

optimistic planning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

PlanninginMarkovDecisionProcesseswith Gap-DependentSampleComplexity

Neural Information Processing SystemsFeb-7-2026, 11:33:10 GMT

This problem-dependent sample complexityresult is expressed in terms of the sub-optimality gapsof the state-action pairs that are visited during exploration.

artificial intelligence, nth, planning & scheduling, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.70)

Add feedback

Optimistic Planning in Markov Decision Processes Using a Generative Model

Balázs Szörényi, Gunnar Kedenburg, Remi Munos

Neural Information Processing SystemsOct-3-2025, 05:48:44 GMT

Neural Information Processing Systems http://nips.cc/

generative model, markov decision process, optimistic planning

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Add feedback

Optimistic Planning in Markov Decision Processes Using a Generative Model

Neural Information Processing SystemsSep-30-2025, 10:53:05 GMT

We consider the problem of online planning in a Markov decision process with discounted rewards for any given initial state. We consider the PAC sample complexity problem of computing, with probability $1-\delta$, an $\epsilon$-optimal action using the smallest possible number of calls to the generative model (which provides reward and next-state samples). We design an algorithm, called StOP (for Stochastic-Optimistic Planning), based on the optimism in the face of uncertainty principle. StOP can be used in the general setting, requires only a generative model, and enjoys a complexity bound that only depends on the local structure of the MDP.

markov decision process, name change, optimistic planning, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

Add feedback

Optimistic planning in Markov decision processes using a generative model

Neural Information Processing SystemsMar-13-2024, 12:16:26 GMT

We consider the problem of online planning in a Markov decision process with discounted rewards for any given initial state. We consider the PAC sample complexity problem of computing, with probability 1 δ, an ɛ-optimal action using the smallest possible number of calls to the generative model (which provides reward and next-state samples). We design an algorithm, called StOP (for Stochastic-Optimistic Planning), based on the "optimism in the face of uncertainty" principle. StOP can be used in the general setting, requires only a generative model, and enjoys a complexity bound that only depends on the local structure of the MDP.

algorithm, complexity, probability, (15 more...)

Neural Information Processing Systems

Country:

Europe > France (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

Add feedback

Optimistic Planning by Regularized Dynamic Programming

Moulin, Antoine, Neu, Gergely

arXiv.org Artificial IntelligenceJun-14-2023

We propose a new method for optimistic planning in infinite-horizon discounted Markov decision processes based on the idea of adding regularization to the updates of an otherwise standard approximate value iteration procedure. This technique allows us to avoid contraction and monotonicity arguments typically required by existing analyses of approximate dynamic programming methods, and in particular to use approximate transition functions estimated via least-squares procedures in MDPs with linear function approximation. We use our method to recover known guarantees in tabular MDPs and to provide a computationally efficient algorithm for learning near-optimal policies in discounted linear mixture MDPs from a single stream of experience, and show it achieves near-optimal statistical guarantees.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2302.14004

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Exploration in Model-based Reinforcement Learning with Randomized Reward

Wang, Lingxiao, Li, Ping

arXiv.org Artificial IntelligenceJan-8-2023

Model-based Reinforcement Learning (MBRL) has been widely adapted due to its sample efficiency. However, existing worst-case regret analysis typically requires optimistic planning, which is not realistic in general. In contrast, motivated by the theory, empirical study utilizes ensemble of models, which achieve state-of-the-art performance on various testing environments. Such deviation between theory and empirical study leads us to question whether randomized model ensemble guarantee optimism, and hence the optimal worst-case regret? This paper partially answers such question from the perspective of reward randomization, a scarcely explored direction of exploration with MBRL. We show that under the kernelized linear regulator (KNR) model, reward randomization guarantees a partial optimism, which further yields a near-optimal worst-case regret in terms of the number of interactions. We further extend our theory to generalized function approximation and identified conditions for reward randomization to attain provably efficient exploration. Correspondingly, we propose concrete examples of efficient reward randomization. To the best of our knowledge, our analysis establishes the first worst-case regret analysis on randomized MBRL with function approximation.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2301.03142

Country:

North America > Canada > Quebec > Montreal (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(9 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Optimistic Planning in Markov Decision Processes Using a Generative Model

Szörényi, Balázs, Kedenburg, Gunnar, Munos, Remi

Neural Information Processing SystemsFeb-14-2020, 07:12:38 GMT

We consider the problem of online planning in a Markov decision process with discounted rewards for any given initial state. We consider the PAC sample complexity problem of computing, with probability $1-\delta$, an $\epsilon$-optimal action using the smallest possible number of calls to the generative model (which provides reward and next-state samples). We design an algorithm, called StOP (for Stochastic-Optimistic Planning), based on the optimism in the face of uncertainty" principle. StOP can be used in the general setting, requires only a generative model, and enjoys a complexity bound that only depends on the local structure of the MDP." Papers published at the Neural Information Processing Systems Conference.

generative model, markov decision process, optimistic planning

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Practical Open-Loop Optimistic Planning

Leurent, Edouard, Maillard, Odalric-Ambrym

arXiv.org Machine LearningApr-9-2019

We consider the problem of online planning in a Markov Decision Process when given only access to a generative model, restricted to open-loop policies - i.e. sequences of actions - and under budget constraint. In this setting, the Open-Loop Optimistic Planning (OLOP) algorithm enjoys good theoretical guarantees but is overly conservative in practice, as we show in numerical experiments. We propose a modified version of the algorithm with tighter upper-confidence bounds, KL-OLOP, that leads to better practical performances while retaining the sample complexity bound. Finally, we propose an efficient implementation that significantly improves the time complexity of both algorithms.

artificial intelligence, machine learning, sequence, (19 more...)

arXiv.org Machine Learning

1904.047

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback