AITopics | Planning & Scheduling

Planning & Scheduling

"Planning is the process of generating (possibly partial) representations of future behavior prior to the use of such plans to constrain or control that behavior. The outcome is usually a set of actions, with temporal and other constraints on them, for execution by some agent or agents. As a core aspect of human intelligence, planning has been studied since the earliest days of AI and cognitive science. Planning research has led to many useful tools for real-world applications, and has yielded significant insights into the organization of behavior and the nature of reasoning about actions."
– Planning entry by Austin Tate in The MIT Encyclopedia of the Cognitive Sciences.

Trajectory-Based Short-Sighted Probabilistic Planning

Neural Information Processing Systems, Mar-14-2024, 11:54:35 GMT

Probabilistic planning captures the uncertainty of plan execution by probabilistically modeling the effects of actions in the environment, and therefore the probability of reaching different states from a given state and action. To compute a solution for a probabilistic planning problem, planners need to manage the uncertainty associated with the different paths from the initial state to a goal state. Several approaches to managing this uncertainty have been proposed, e.g., considering all paths at once, determinizing actions, and sampling. In this paper, we introduce trajectory-based short-sighted Stochastic Shortest Path Problems (SSPs), a novel approach to managing uncertainty in probabilistic planning problems in which states reachable with low probability are replaced by artificial goals that heuristically estimate their cost to reach a goal state. We also extend the theoretical results of the Short-Sighted Probabilistic Planner (SSiPP) [1] by proving that SSiPP always terminates and is asymptotically optimal under sufficient conditions on the structure of short-sighted SSPs. We empirically compare SSiPP using trajectory-based short-sighted SSPs with the winners of the previous probabilistic planning competitions and other state-of-the-art planners on the triangle tireworld problems.

short-sighted ssp, ssipp, ssp, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.89)
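
The idea in the abstract above, replacing states that are reachable only with low probability by artificial goals, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the dictionary encoding of the SSP, the function names, the threshold `rho`, and the choice to measure reachability by the most probable trajectory from the current state.

```python
import heapq


def max_trajectory_probability(transitions, goals, start):
    """Probability of the most likely trajectory from `start` to each state.

    `transitions` maps state -> action -> list of (probability, next_state).
    This encoding is a hypothetical one chosen for the sketch.
    """
    best = {start: 1.0}
    heap = [(-1.0, start)]  # max-heap on probability via negated keys
    while heap:
        neg_p, state = heapq.heappop(heap)
        p = -neg_p
        if p < best.get(state, 0.0):
            continue  # stale queue entry
        if state in goals:
            continue  # goals are absorbing, nothing to expand
        for outcomes in transitions.get(state, {}).values():
            for prob, nxt in outcomes:
                q = p * prob
                if q > best.get(nxt, 0.0):
                    best[nxt] = q
                    heapq.heappush(heap, (-q, nxt))
    return best


def short_sighted_ssp(transitions, goals, start, rho, heuristic):
    """Keep states whose best trajectory probability is at least `rho`.

    Successors of kept states that fall below the threshold become
    artificial goals whose terminal cost is the heuristic estimate.
    """
    best = max_trajectory_probability(transitions, goals, start)
    kept = {s for s, p in best.items() if p >= rho}
    artificial_goals = {}
    for state in kept - set(goals):
        for outcomes in transitions.get(state, {}).values():
            for _, nxt in outcomes:
                if nxt not in kept:
                    artificial_goals[nxt] = heuristic(nxt)
    return kept, artificial_goals
```

A planner in this style would solve the smaller problem over `kept` plus the artificial goals, act until it reaches an artificial goal, and then rebuild the short-sighted problem from the new state.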

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

Neural Information Processing Systems, Mar-14-2024, 05:58:57 GMT

Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems, because it avoids expensive applications of Bayes' rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing it working in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration.

algorithm, node, simulation, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
(2 more...)

fe709c654eac84d5239d1a12a4f71877-Reviews.html

Neural Information Processing Systems, Mar-14-2024, 01:08:14 GMT

The main idea is to sample several determinizations of the system in the form of roll-out trees in which each state/action pair has only one sampled successor. A combination of breadth-first and best-first search is used to explore the deterministic trees, which are then recombined to create a stochastic model from which a policy can be calculated. The algorithm is proven to be consistent: as the number of trees and the number of nodes in each tree both approach infinity, the value at the root can be approximated arbitrarily well with high probability. The algorithm is empirically compared to a planning algorithm that requires a full transition model, and it performs well in comparison.

algorithm, approximation, asop, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.50)

Aggregating Optimistic Planning Trees for Solving Markov Decision Processes

Neural Information Processing Systems, Mar-14-2024, 01:08:11 GMT

This paper addresses the problem of online planning in Markov decision processes using a randomized simulator, under a budget constraint. We propose a new algorithm which is based on the construction of a forest of planning trees, where each tree corresponds to a random realization of the stochastic environment. The trees are constructed using a "safe" optimistic planning strategy combining the optimistic principle (in order to explore the most promising part of the search space first) with a safety principle (which guarantees a certain amount of uniform exploration). In the decision-making step of the algorithm, the individual trees are aggregated and an immediate action is recommended. We provide a finite-sample analysis and discuss the trade-off between the principles of optimism and safety. We also report numerical results on a benchmark problem. Our algorithm performs as well as state-of-the-art optimistic planning algorithms, and better than a related algorithm which additionally assumes the knowledge of all transition distributions.

algorithm, budget, mdp, (15 more...)

Neural Information Processing Systems

Country: Europe (0.05)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Development of control algorithms for mobile robotics focused on their potential use for FPGA-based robots

Suárez-Gómez, Andrés-David; Ortega, Andres A. Hernandez

arXiv.org Artificial Intelligence, Mar-14-2024

This paper investigates the development and optimization of control algorithms for mobile robotics, with a keen focus on their implementation in Field-Programmable Gate Arrays (FPGAs). It delves into both classical control approaches such as PID and modern techniques including deep learning, addressing their application in sectors ranging from industrial automation to medical care. The study highlights the practical challenges and advancements in embedding these algorithms into FPGAs, which offer significant benefits for mobile robotics due to their high-speed processing and parallel computation capabilities. Through an analysis of various control strategies, the paper showcases the improvements in robot performance, particularly in navigation and obstacle avoidance. It emphasizes the critical role of FPGAs in enhancing the efficiency and adaptability of control algorithms in dynamic environments. Additionally, the research discusses the difficulties in benchmarking and evaluating the performance of these algorithms in real-world applications, suggesting a need for standardized evaluation criteria. The contribution of this work lies in its comprehensive examination of control algorithms' potential in FPGA-based mobile robotics, offering insights into future research directions for improving robotic autonomy and operational efficiency.

algorithm, control algorithm, robot, (14 more...)

arXiv.org Artificial Intelligence

2403.09459

Country:

South America > Colombia (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Russia (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.49)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

DESPOT: Online POMDP Planning with Regularization

Neural Information Processing Systems, Mar-13-2024, 20:02:38 GMT

POMDPs provide a principled framework for planning under uncertainty, but are computationally intractable, due to the "curse of dimensionality" and the "curse of history". This paper presents an online POMDP algorithm that alleviates these difficulties by focusing the search on a set of randomly sampled scenarios.

algorithm, despot, scenario, (15 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
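
To make "a set of randomly sampled scenarios" from the abstract above concrete, here is a rough sketch under stated assumptions. A scenario is taken to be a start state drawn from the initial belief plus a fixed stream of random numbers, so that simulating any policy under it becomes deterministic. The two-state toy model, the function names, and the discount value are all hypothetical, and the sketch shows only scenario-based policy evaluation, not the search or the regularization described in the paper.

```python
import random


def sample_scenarios(initial_belief, k, horizon, rng):
    """Draw k scenarios: a start state plus one random number per step."""
    return [
        (rng.choice(initial_belief), [rng.random() for _ in range(horizon)])
        for _ in range(k)
    ]


def evaluate_policy(policy, step, scenarios, discount=0.95):
    """Average discounted return of `policy` over a fixed scenario set.

    `step(state, action, phi)` must be deterministic given the random
    number `phi`; it returns (next_state, observation, reward).
    """
    total = 0.0
    for state, stream in scenarios:
        value, scale, history = 0.0, 1.0, ()
        for phi in stream:
            action = policy(history)
            state, observation, reward = step(state, action, phi)
            value += scale * reward
            scale *= discount
            history += ((action, observation),)
        total += value
    return total / len(scenarios)


# Hypothetical toy model: a hidden side ("left" or "right") is observed
# correctly with probability 0.85; guessing the hidden side earns reward 1.
def toy_step(state, action, phi):
    other = "right" if state == "left" else "left"
    observation = state if phi < 0.85 else other
    reward = 1.0 if action == state else 0.0
    return state, observation, reward


def guess_last_observation(history):
    """Guess whatever was observed last; default to "left" at the start."""
    return history[-1][1] if history else "left"
```

Because the scenarios are fixed, two candidate policies can be compared on exactly the same sampled randomness, and repeated evaluations of one policy return the same number.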

846c260d715e5b854ffad5f70a516c88-Reviews.html

Neural Information Processing Systems, Mar-13-2024, 18:06:55 GMT

The paper proposes Bayesian inference in Monte-Carlo tree search (MCTS) with a Thompson-sampling-based action-selection strategy, called the Dirichlet-NormalGamma MCTS (DNG-MCTS) algorithm. The method approximates the accumulated reward of following the current policy from a state, $X_{s,\pi(s)}$, by a normal distribution with a NormalGamma prior. The state transition probabilities are estimated via Dirichlet distributions. The action-selection strategy is based on Thompson sampling: the expected cumulative reward for each action is computed from the parametric distribution with parameters drawn from the posterior distributions, and the action with the highest expectation is selected. The authors apply the proposed method to several benchmark tasks and show that it can converge (slightly) faster than the UCT algorithm. Theoretical convergence properties are also provided.

assumption, cumulative reward, normal distribution, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.39)

Bayesian Mixture Modeling and Inference based Thompson Sampling in Monte-Carlo Tree Search

Neural Information Processing Systems, Mar-13-2024, 18:06:53 GMT

Monte-Carlo tree search (MCTS) has been drawing great interest in recent years for planning and learning under uncertainty. One of the key challenges is the trade-off between exploration and exploitation. To address this, we present a novel approach for MCTS using Bayesian mixture modeling and inference based Thompson sampling and apply it to the problem of online planning in MDPs. Our algorithm, named Dirichlet-NormalGamma MCTS (DNG-MCTS), models the uncertainty of the accumulated reward for actions in the search tree as a mixture of Normal distributions. We perform inferences on the mixture in Bayesian settings by choosing conjugate priors in the form of combinations of Dirichlet and NormalGamma distributions and select the best action at each decision node using Thompson sampling. Experimental results confirm that our algorithm advances the state-of-the-art UCT approach with better values on several benchmark problems.

algorithm, dng-mct, thompson, (16 more...)

Neural Information Processing Systems

Country: Asia > China (0.05)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
(2 more...)
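
The NormalGamma and Thompson sampling ingredients named in the abstract above can be sketched in a few lines. This is a deliberately reduced, bandit-level illustration and not DNG-MCTS itself: there is no search tree and no Dirichlet model of transitions. The class name, the default prior hyperparameters, and the one-observation-at-a-time update are assumptions chosen for the sketch; the update is the standard conjugate NormalGamma update for a normal likelihood with unknown mean and precision.

```python
import math
import random


class NormalGamma:
    """NormalGamma posterior over the mean and precision of a return."""

    def __init__(self, mu=0.0, lam=1.0, alpha=1.0, beta=1.0):
        # Hypothetical default prior, not taken from the paper.
        self.mu, self.lam, self.alpha, self.beta = mu, lam, alpha, beta

    def update(self, x):
        """Conjugate update after observing a single return x."""
        lam_new = self.lam + 1.0
        self.beta += self.lam * (x - self.mu) ** 2 / (2.0 * lam_new)
        self.mu = (self.lam * self.mu + x) / lam_new
        self.lam = lam_new
        self.alpha += 0.5

    def sample_mean(self, rng):
        """Draw a precision, then a mean conditioned on that precision."""
        # random.gammavariate takes (shape, scale); beta here is a rate.
        tau = rng.gammavariate(self.alpha, 1.0 / self.beta)
        return rng.gauss(self.mu, 1.0 / math.sqrt(self.lam * tau))


def thompson_select(posteriors, rng):
    """Pick the action whose sampled mean return is largest."""
    samples = [p.sample_mean(rng) for p in posteriors]
    return max(range(len(samples)), key=samples.__getitem__)
```

In use, each action keeps its own `NormalGamma`; the agent calls `thompson_select`, observes a return for the chosen action, and feeds it to that action's `update`. Exploration comes from the spread of the posterior samples rather than from an explicit bonus term as in UCT.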

Synthesizing Robust Plans under Incomplete Domain Models

Neural Information Processing Systems, Mar-13-2024, 15:11:05 GMT

Most current planners assume complete domain models and focus on generating correct plans. Unfortunately, domain modeling is a laborious and error-prone task, so real-world agents have to plan with incomplete domain models. While domain experts cannot guarantee completeness, they are often able to circumscribe the incompleteness of the model by providing annotations as to which parts of the domain model may be incomplete. In such cases, the goal should be to synthesize plans that are robust with respect to any known incompleteness of the domain. In this paper, we first introduce annotations expressing the knowledge of the domain incompleteness and formalize the notion of plan robustness with respect to an incomplete domain model. We then show an approach to compiling the problem of finding robust plans to the conformant probabilistic planning problem, and present experimental results with the Probabilistic-FF planner.

domain model, precondition and effect, robustness, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Arizona (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Convergence of Monte Carlo Tree Search in Simultaneous Move Games

Marc Lanctot

Neural Information Processing Systems, Mar-13-2024, 14:55:20 GMT

We study Monte Carlo tree search (MCTS) in zero-sum extensive-form games with perfect information and simultaneous moves. We present a general template of MCTS algorithms for these games, which can be instantiated by various selection methods. We formally prove that if a selection method is ε-Hannan consistent in a matrix game and satisfies additional requirements on exploration, then the MCTS algorithm eventually converges to an approximate Nash equilibrium (NE) of the extensive-form game. We empirically evaluate this claim using regret matching and Exp3 as the selection methods on randomly generated games and empirically selected worst-case games. We confirm the formal result and show that additional MCTS variants also converge to approximate NE on the evaluated games.

algorithm, convergence, matrix game, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Europe > Czechia > Prague (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
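
Regret matching, one of the selection methods named in the abstract above, can be shown on a single matrix game. The sketch below is a simplified, full-information variant written for illustration: both players update with exact expected utilities instead of sampled outcomes, and there is no tree and no exploration term, so it is not the MCTS template from the paper. The function names, the iteration count, and the small initial regret used to break symmetry are assumptions.

```python
def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no positive regret: uniform


def solve_matrix_game(payoff, iterations=20000):
    """Self-play on a zero-sum matrix game; returns the average strategies.

    `payoff[i][j]` is the row player's payoff; the column player gets its
    negation.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret = [1.0] + [0.0] * (n_rows - 1)  # arbitrary asymmetric start
    col_regret = [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching_strategy(row_regret)
        y = regret_matching_strategy(col_regret)
        # Expected utility of each pure action against the opponent's mix.
        row_util = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                    for i in range(n_rows)]
        col_util = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                    for j in range(n_cols)]
        row_value = sum(x[i] * row_util[i] for i in range(n_rows))
        col_value = sum(y[j] * col_util[j] for j in range(n_cols))
        for i in range(n_rows):
            row_regret[i] += row_util[i] - row_value
            row_sum[i] += x[i]
        for j in range(n_cols):
            col_regret[j] += col_util[j] - col_value
            col_sum[j] += y[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])
```

On matching pennies, `solve_matrix_game([[1, -1], [-1, 1]])` yields average strategies close to the uniform equilibrium. The per-iteration strategies may keep cycling; it is the averages that approach equilibrium, which is the sense of convergence relevant to Hannan-consistent selection.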