 Planning & Scheduling


Enhancements for Real-Time Monte-Carlo Tree Search in General Video Game Playing

arXiv.org Artificial Intelligence

General Video Game Playing (GVGP) is a field of Artificial Intelligence where agents play a variety of real-time video games that are unknown in advance. This limits the use of domain-specific heuristics. Monte-Carlo Tree Search (MCTS) is a search technique for game playing that does not rely on domain-specific knowledge. This paper discusses eight enhancements for MCTS in GVGP: Progressive History, N-Gram Selection Technique, Tree Reuse, Breadth-First Tree Initialization, Loss Avoidance, Novelty-Based Pruning, Knowledge-Based Evaluations, and Deterministic Game Detection. Some of these are known from existing literature and are either extended or introduced in the context of GVGP, and some are novel enhancements for MCTS. Most enhancements are shown to provide statistically significant increases in win percentages when applied individually. When combined, they increase the average win percentage over sixty different games from 31.0% to 48.4% in comparison to a vanilla MCTS implementation, approaching a level that is competitive with the best agents of the GVG-AI competition in 2015.
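
For readers unfamiliar with the baseline, the selection step of a plain MCTS can be sketched as follows. This is only an illustrative sketch: the abstract does not say which selection rule the vanilla implementation uses, so the UCB1 formula, the exploration constant `c`, and the function names `ucb1` and `select_child` are all assumptions made here for illustration.

```python
import math


def ucb1(value_sum, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of one child node (assumed rule, not taken from the paper)."""
    if visits == 0:
        # Unvisited children are tried first.
        return math.inf
    mean = value_sum / visits  # average playout reward (exploitation)
    bonus = c * math.sqrt(math.log(parent_visits) / visits)  # exploration
    return mean + bonus


def select_child(children, parent_visits):
    """Return the index of the child with the highest UCB1 score.

    children: list of (value_sum, visits) pairs, one per legal action.
    """
    return max(
        range(len(children)),
        key=lambda i: ucb1(children[i][0], children[i][1], parent_visits),
    )


if __name__ == "__main__":
    # Hypothetical statistics for three actions; the second is unvisited.
    stats = [(5.0, 10), (0.0, 0), (1.0, 2)]
    print(select_child(stats, parent_visits=12))
```

The rule uses only visit counts and playout rewards, which is why it needs no game-specific heuristic.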


Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

arXiv.org Artificial Intelligence

Many recent works have explored using language models for planning problems. One line of research focuses on translating natural language descriptions of planning tasks into structured planning languages, such as the planning domain definition language (PDDL). While this approach is promising, accurately measuring the quality of generated PDDL code continues to pose significant challenges. First, generated PDDL code is typically evaluated using planning validators that check whether the problem can be solved with a planner. This method is insufficient because a language model might generate valid PDDL code that does not align with the natural language description of the task. Second, existing evaluation sets often have natural language descriptions of the planning task that closely resemble the ground truth PDDL, reducing the challenge of the task. To bridge this gap, we introduce Planetarium, a benchmark designed to evaluate language models' ability to generate PDDL code from natural language descriptions of planning tasks. We begin by creating a PDDL equivalence algorithm that rigorously evaluates the correctness of PDDL code generated by language models by flexibly comparing it against a ground truth PDDL. Then, we present a dataset of 132,037 text-to-PDDL pairs across 13 different tasks, with varying levels of difficulty. Finally, we evaluate several API-access and open-weight language models that reveal this task's complexity. For example, 87.6% of the PDDL problem descriptions generated by GPT-4o are syntactically parseable, 82.2% are valid, solvable problems, but only 35.1% are semantically correct, highlighting the need for a more rigorous benchmark for this problem.


Prävention und Beseitigung von Fehlerursachen im Kontext von unbemannten Fahrzeugen

arXiv.org Artificial Intelligence

Mobile robots, becoming increasingly autonomous, are capable of operating in diverse and unknown environments. This flexibility allows them to fulfill goals independently and adapt their actions dynamically without rigidly predefined control code. However, their autonomous behavior complicates guaranteeing safety and reliability, because a human operator has limited ability to accurately supervise and verify each robot's actions. To ensure the safety and reliability of autonomous mobile robots, which are aspects of dependability, methods are needed both in the planning and in the execution of missions. In this article, a twofold approach is presented that ensures fault removal in the context of mission planning and fault prevention during mission execution for autonomous mobile robots. First, the approach consists of a concept based on formal verification applied during the planning phase of missions. Second, the approach consists of a rule-based concept applied during mission execution. A use case applying the approach is presented, discussing how the two concepts complement each other and what contribution they make to certain aspects of dependability.


PROC2PDDL: Open-Domain Planning Representations from Texts

arXiv.org Artificial Intelligence

Planning in a text-based environment continues to be a major challenge for AI systems. Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL, the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations. Using this dataset, we evaluate state-of-the-art models on defining the preconditions and effects of actions. We show that Proc2PDDL is highly challenging, with GPT-3.5's success rate close to 0% and GPT-4's around 35%. Our analysis shows both syntactic and semantic errors, indicating LMs' deficiency in both generating domain-specific programs and reasoning about events. We hope this analysis and dataset help future progress towards integrating the best of LMs and formal planning.


MARLIN: A Cloud Integrated Robotic Solution to Support Intralogistics in Retail

arXiv.org Artificial Intelligence

In this paper, we present the service robot MARLIN and its integration with the K4R platform, a cloud system for complex AI applications in retail. At its core, this platform contains so-called semantic digital twins, a semantically annotated representation of the retail store. MARLIN continuously exchanges data with the K4R platform, improving the robot's capabilities in perception, autonomous navigation, and task planning. We exploit these capabilities in a retail intralogistics scenario, specifically by assisting store employees in stocking shelves. We demonstrate that MARLIN is able to update the digital representation of the retail store by detecting and classifying obstacles, autonomously planning and executing replenishment missions, adapting to unforeseen changes in the environment, and interacting with store employees. Experiments are conducted in simulation, in a laboratory environment, and in a real store. We also describe and evaluate a novel algorithm for autonomous navigation of articulated tractor-trailer systems. The algorithm outperforms the manufacturer's proprietary navigation approach and improves MARLIN's navigation capabilities in confined spaces.


Revisión de Métodos de Planificación de Camino de Cobertura para Entornos Agrícolas

arXiv.org Artificial Intelligence

The use of an efficient coverage planning method is key for autonomous navigation in agricultural environments, where a robot must cover large areas of crops. This paper generally reviews the current state of the art of coverage path planning methods. Two widely used techniques applicable to agricultural environments are described in detail. The first consists of breaking down a complex field with obstacles into simpler subregions known as cells, to subsequently generate a coverage pattern in each of them. The second analyzes spaces composed of parallel strips through which the robot must circulate, in order to find the optimal order of visiting strips that minimizes the total distance traveled. Additionally, the combination of both techniques is discussed in order to obtain a more efficient global coverage plan. This analysis was conceived to be implemented with the soybean crop weeding robot developed at CIFASIS (CONICET-UNR).
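
The strip-ordering idea in the abstract above can be illustrated with a deliberately simplified sketch. Everything below is an assumption made for illustration and not the method reviewed in the paper: strips are reduced to their x-positions, all strips have equal length (so only the transit distance between strips matters), turning constraints are ignored, at least one strip is assumed, and the search is brute force over all visiting orders. The names `transit_distance` and `best_strip_order` are hypothetical.

```python
from itertools import permutations


def transit_distance(order, strip_x, entry_x):
    """Distance travelled between strips for one visiting order.

    order:   tuple of strip indices in the order they are visited
    strip_x: x-position of each parallel strip (hypothetical model)
    entry_x: x-position where the robot enters the field
    """
    # Distance from the entry point to the first strip.
    dist = abs(entry_x - strip_x[order[0]])
    # Distance between each pair of consecutively visited strips.
    for a, b in zip(order, order[1:]):
        dist += abs(strip_x[a] - strip_x[b])
    return dist


def best_strip_order(strip_x, entry_x):
    """Try every visiting order and keep the one with the least transit."""
    orders = permutations(range(len(strip_x)))
    return min(orders, key=lambda o: transit_distance(o, strip_x, entry_x))


if __name__ == "__main__":
    # Three strips; the robot enters next to the last one.
    print(best_strip_order([0.0, 2.0, 4.0], entry_x=4.0))
```

Brute force is only practical for a handful of strips, since the number of orders grows factorially; it serves here to make the objective concrete.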


Universal Plans: One Action Sequence to Solve Them All!

arXiv.org Artificial Intelligence

This paper introduces the notion of a universal plan, which when executed, is guaranteed to solve all planning problems in a category, regardless of the obstacles, initial state, and goal set. Such plans are specified as a deterministic sequence of actions that are blindly applied without any sensor feedback. Thus, they can be considered as pure exploration in a reinforcement learning context, and we show that with basic memory requirements, they even yield asymptotically optimal plans. Building upon results in number theory and theory of automata, we provide universal plans both for discrete and continuous (motion) planning and prove their (semi)completeness. The concepts are applied and illustrated through simulation studies, and several directions for future research are sketched.


VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs

arXiv.org Artificial Intelligence

Vision language models (VLMs) are an exciting emerging class of language models (LMs) that have merged classic LM capabilities with those of image processing systems. However, the ways that these capabilities combine are not always intuitive and warrant direct investigation. One understudied capability in VLMs is visual spatial planning--the ability to comprehend the spatial arrangements of objects and devise action plans to achieve desired outcomes in visual scenes. In our study, we introduce VSP, a benchmark that 1) evaluates the spatial planning capability of these models in general, and 2) breaks down the visual planning task into finer-grained sub-tasks, including perception and reasoning, and measures the models' capabilities in these sub-tasks. Our evaluation shows that both open-source and private VLMs fail to generate effective plans for even simple spatial planning tasks. Evaluations on the fine-grained analytical tasks further reveal fundamental deficiencies in the models' visual perception and bottlenecks in reasoning abilities, explaining their poor performance in the general spatial planning tasks. Our work illuminates future directions for improving VLMs' abilities in spatial planning. Our benchmark is publicly available at https://github.com/UCSB-NLP-Chang/


UAV Trajectory Planning with Path Processing

arXiv.org Artificial Intelligence

This paper examines the influence of initial guesses on trajectory planning for Unmanned Aerial Vehicles (UAVs) formulated as an Optimal Control Problem (OCP). The OCP is solved numerically using the pseudospectral collocation method. Our approach leverages a path identified through Lazy Theta* and incorporates known constraints and a model of the UAV's behavior for the initial guess. Our findings indicate that a suitable initial guess has a beneficial influence on the planned trajectory. They also suggest promising directions for future research.


Automated radiotherapy treatment planning guided by GPT-4Vision

arXiv.org Artificial Intelligence

Radiotherapy treatment planning is a time-consuming and potentially subjective process that requires the iterative adjustment of model parameters to balance multiple conflicting objectives. Recent advancements in large foundation models offer promising avenues for addressing the challenges in planning and clinical decision-making. This study introduces GPT-RadPlan, a fully automated treatment planning framework that harnesses prior radiation oncology knowledge encoded in multi-modal large language models, such as GPT-4Vision (GPT-4V) from OpenAI. GPT-RadPlan is made aware of planning protocols as context and acts as an expert human planner, capable of guiding a treatment planning process. Via in-context learning, we incorporate clinical protocols for various disease sites as prompts to enable GPT-4V to acquire treatment planning domain knowledge. The resulting GPT-RadPlan agent is integrated into our in-house inverse treatment planning system through an API. The efficacy of the automated planning system is showcased using multiple prostate and head & neck cancer cases, where we compared GPT-RadPlan results to clinical plans. In all cases, GPT-RadPlan either outperformed or matched the clinical plans, demonstrating superior target coverage and organ-at-risk sparing. Consistently satisfying the dosimetric objectives in the clinical protocol, GPT-RadPlan represents the first multimodal large language model agent that mimics the behaviors of human planners in radiation oncology clinics, achieving remarkable results in automating the treatment planning process without the need for additional training.