handempty
ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning
Liang, Yichao, Nguyen, Dat, Yang, Cambridge, Li, Tianyang, Tenenbaum, Joshua B., Rasmussen, Carl Edward, Weller, Adrian, Tavares, Zenna, Silver, Tom, Ellis, Kevin
Long-horizon embodied planning is challenging because the world does not only change through an agent's actions: exogenous processes (e.g., water heating, dominoes cascading) unfold concurrently with the agent's actions. We propose a framework for abstract world models that jointly learns (i) symbolic state representations and (ii) causal processes for both endogenous actions and exogenous mechanisms. Each causal process models the time course of a stochastic cause-effect relation. We learn these world models from limited data via variational Bayesian inference combined with LLM proposals. Across five simulated tabletop robotics environments, the learned models enable fast planning that generalizes to held-out tasks with more objects and more complex goals, outperforming a range of baselines.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts (0.04)
- Research Report (0.63)
- Workflow (0.46)
Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
Verma, Pulkit, La, Ngoc, Favier, Anthony, Mishra, Swaroop, Shah, Julie A.
Large language models (LLMs) have demonstrated impressive capabilities across diverse tasks, yet their ability to perform structured symbolic planning remains limited, particularly in domains requiring formal representations like the Planning Domain Definition Language (PDDL). In this paper, we present a novel instruction tuning framework, PDDL-Instruct, designed to enhance LLMs' symbolic planning capabilities through logical chain-of-thought reasoning. Our approach focuses on teaching models to rigorously reason about action applicability, state transitions, and plan validity using explicit logical inference steps. By developing instruction prompts that guide models through the precise logical reasoning required to determine when actions can be applied in a given state, we enable LLMs to self-correct their planning processes through structured reflection. The framework systematically builds verification skills by decomposing the planning process into explicit reasoning chains about precondition satisfaction, effect application, and invariant preservation. Experimental results on multiple planning domains show that our chain-of-thought reasoning based instruction-tuned models are significantly better at planning, achieving planning accuracy of up to 94% on standard benchmarks, representing a 66% absolute improvement over baseline models. This work bridges the gap between the general reasoning capabilities of LLMs and the logical precision required for automated planning, offering a promising direction for developing better AI planning systems.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
Understanding Physical Properties of Unseen Deformable Objects by Leveraging Large Language Models and Robot Actions
Park, Changmin, Lee, Beomjoon, Jung, Haechan, Jung, Haejin, Nam, Changjoo
In this paper, we consider the problem of understanding the physical properties of unseen objects through interactions between the objects and a robot. Handling unseen objects with special properties such as deformability is challenging for traditional task and motion planning approaches as they are often with the closed world assumption. Recent results in Large Language Models (LLMs) based task planning have shown the ability to reason about unseen objects. However, most studies assume rigid objects, overlooking their physical properties. We propose an LLM-based method for probing the physical properties of unseen deformable objects for the purpose of task planning. For a given set of object properties (e.g., foldability, bendability), our method uses robot actions to determine the properties by interacting with the objects. Based on the properties examined by the LLM and robot actions, the LLM generates a task plan for a specific domain such as object packing. In the experiment, we show that the proposed method can identify properties of deformable objects, which are further used for a bin-packing task where the properties take crucial roles to succeed.
Bridging the Reasoning Gap: Small LLMs Can Plan with Generalised Strategies
Borro, Andrey, Riddle, Patricia J, Barley, Michael W, Witbrock, Michael J
Recent advancements in the reasoning skills of Large Language Models (LLMs) demonstrate an increase in the ability of LLMs to solve simple planning tasks. However, as long as the driving force behind improved reasoning capability is the size and complexity of the model, the financial and computational costs associated with running them will also increase. This trend raises questions about continued accessibility and whether these improvements will increase at the same pace as models continue to grow in size and expense. We propose two approaches to enhance the reasoning ability of less resource-intensive LLMs. (1) Provide them with a generalised strategy for solving tasks within a given domain, generated by a more resource-intensive LLM. (2) Exploit their cost-effectiveness by iteratively prompting these models to correct errors in their proposed solutions. Our empirical results from planning and mathematical reasoning tasks demonstrate that these methods improve the performance of less resource-intensive LLMs to levels comparable with their more resource-intensive counterparts, at a fraction of the cost. Additionally, we show that the utilisation of generalised strategies in our experiments reduced the cost of the less resource-intensive model by nearly 30 percent on average.
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Europe > Switzerland (0.04)
LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench
Valmeekam, Karthik, Stechly, Kaya, Kambhampati, Subbarao
The ability to plan a course of action that achieves a desired state of affairs has long been considered a core competence of intelligent agents and has been an integral part of AI research since its inception. With the advent of large language models (LLMs), there has been considerable interest in the question of whether or not they possess such planning abilities. PlanBench, an extensible benchmark we developed in 2022, soon after the release of GPT3, has remained an important tool for evaluating the planning abilities of LLMs. Despite the slew of new private and open source LLMs since GPT3, progress on this benchmark has been surprisingly slow. OpenAI claims that their recent o1 (Strawberry) model has been specifically constructed and trained to escape the normal limitations of autoregressive LLMs--making it a new kind of model: a Large Reasoning Model (LRM). Using this development as a catalyst, this paper takes a comprehensive look at how well current LLMs and new LRMs do on PlanBench. As we shall see, while o1's performance is a quantum improvement on the benchmark, outpacing the competition, it is still far from saturating it. This improvement also brings to the fore questions about accuracy, efficiency, and guarantees which must be considered before deploying such systems.
- North America > United States > Arizona (0.04)
- North America > United States > New York (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Asia > Middle East > Jordan (0.04)
DKPROMPT: Domain Knowledge Prompting Vision-Language Models for Open-World Planning
Zhang, Xiaohan, Altaweel, Zainab, Hayamizu, Yohei, Ding, Yan, Amiri, Saeid, Yang, Hao, Kaminski, Andy, Esselink, Chad, Zhang, Shiqi
Prompting foundation models such as large language models (LLMs) and vision-language models (VLMs) requires extensive domain knowledge and manual efforts, resulting in the so-called "prompt engineering" problem. To improve the performance of foundation models, one can provide examples explicitly [1] or implicitly [2], or encourage intermediate reasoning steps [3, 4]. Despite all the efforts, their performance in long-horizon reasoning tasks is still limited. Classical planning methods, including those defined by Planning Domain Definition Language (PDDL), are strong in ensuring the soundness, completeness and efficiency in planning tasks [5]. However, those classical planners rely on predefined states and actions, and do not perform well in open-world scenarios. We aim to enjoy the openness of VLMs in scene understanding while retaining the strong long-horizon reasoning capabilities of classical planners. Our key idea is to extract domain knowledge from classical planners for prompting VLMs towards enabling classical planners that are visually grounded and responsive to open-world situations. Given the natural connection between planning symbols and human language, this paper investigates how pre-trained VLMs can assist the robot in realizing symbolic plans generated by classical planners, while avoiding the engineering efforts of checking the outcomes of each action.
- North America > United States > New York > Broome County > Binghamton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)
- Research Report (1.00)
- Overview (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Settings
Karia, Rushang, Verma, Pulkit, Speranzon, Alberto, Srivastava, Siddharth
This paper introduces a new approach for continual planning and model learning in non-stationary stochastic environments expressed using relational representations. Such capabilities are essential for the deployment of sequential decision-making systems in the uncertain, constantly evolving real world. Working in such practical settings with unknown (and non-stationary) transition systems and changing tasks, the proposed framework models gaps in the agent's current state of knowledge and uses them to conduct focused, investigative explorations. Data collected using these explorations is used for learning generalizable probabilistic models for solving the current task despite continual changes in the environment dynamics. Empirical evaluations on several benchmark domains show that this approach significantly outperforms planning and RL baselines in terms of sample complexity in non-stationary settings. Theoretical results show that the system reverts to exhibit desirable convergence properties when stationarity holds.
- North America > United States > New York (0.04)
- North America > United States > Arizona (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
NeSIG: A Neuro-Symbolic Method for Learning to Generate Planning Problems
Núñez-Molina, Carlos, Mesejo, Pablo, Fernández-Olivares, Juan
In the field of Automated Planning there is often the need for a set of planning problems from a particular domain, e.g., to be used as training data for Machine Learning or as benchmarks in planning competitions. In most cases, these problems are created either by hand or by a domain-specific generator, putting a burden on the human designers. In this paper we propose NeSIG, to the best of our knowledge the first domain-independent method for automatically generating planning problems that are valid, diverse and difficult to solve. We formulate problem generation as a Markov Decision Process and train two generative policies with Deep Reinforcement Learning to generate problems with the desired properties. We conduct experiments on several classical domains, comparing our method with handcrafted domain-specific generators that generate valid and diverse problems but do not optimize difficulty. The results show NeSIG is able to automatically generate valid problems of greater difficulty than the competitor approaches, while maintaining good diversity.
Task planning in robotics
Suppose we have a robot in a simple world like the one below. Let's consider commanding our robot to perform a task such as "take the apple from the shelf and put it on the table". Simple task planning example world: A robot can move between a finite set of locations, and can pick and place objects at those locations. I would argue we humans have pretty good intuition for how a robot could achieve this task. We could describe what the robot should do by breaking the solution down into individual actions.
On Model Reconciliation: How to Reconcile When Robot Does not Know Human's Model?
The Model Reconciliation Problem (MRP) was introduced to address issues in explainable AI planning. A solution to a MRP is an explanation for the differences between the models of the human and the planning agent (robot). Most approaches to solving MRPs assume that the robot, who needs to provide explanations, knows the human model. This assumption is not always realistic in several situations (e.g., the human might decide to update her model and the robot is unaware of the updates). In this paper, we propose a dialog-based approach for computing explanations of MRPs under the assumptions that (i) the robot does not know the human model; (ii) the human and the robot share the set of predicates of the planning domain and their exchanges are about action descriptions and fluents' values; (iii) communication between the parties is perfect; and (iv) the parties are truthful. A solution of a MRP is computed through a dialog, defined as a sequence of rounds of exchanges, between the robot and the human. In each round, the robot sends a potential explanation, called proposal, to the human who replies with her evaluation of the proposal, called response. We develop algorithms for computing proposals by the robot and responses by the human and implement these algorithms in a system that combines imperative means with answer set programming using the multi-shot feature of clingo.
- North America > United States > New Mexico (0.04)
- Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)