Planning & Scheduling
Google can save locations you screenshot in Maps to help with travel planning
It might be around that time of year when you're starting to figure out your summer vacation plans. Google has revealed some new features that can help with that, including a handy AI-powered one for Maps. If you turn on the new screenshot list, Gemini can automatically recognize locations that are mentioned in screenshots you take in the app. You can then save the places you're interested in to a list. These saved spots will appear on the map, and you can share the list with your travel companions.
Neuro-Symbolic Imitation Learning: Discovering Symbolic Abstractions for Skill Learning
Keller, Leon, Tanneberg, Daniel, Peters, Jan
Imitation learning is a popular method for teaching robots new behaviors. However, most existing methods focus on teaching short, isolated skills rather than long, multi-step tasks. To bridge this gap, imitation learning algorithms must not only learn individual skills but also an abstract understanding of how to sequence these skills to perform extended tasks effectively. This paper addresses this challenge by proposing a neuro-symbolic imitation learning framework. Using task demonstrations, the system first learns a symbolic representation that abstracts the low-level state-action space. The learned representation decomposes a task into easier subtasks and allows the system to leverage symbolic planning to generate abstract plans. Subsequently, the system utilizes this task decomposition to learn a set of neural skills capable of refining abstract plans into actionable robot commands. Experimental results in three simulated robotic environments demonstrate that, compared to baselines, our neuro-symbolic approach increases data efficiency, improves generalization capabilities, and facilitates interpretability.
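The symbolic planning step described in this abstract can be illustrated with a minimal sketch. It assumes hand-written, STRIPS-style operators; the paper learns its symbolic representation from demonstrations, so the operator names and predicates below are hypothetical placeholders, and the breadth-first search stands in for whatever symbolic planner the authors use.

```python
from collections import deque

# Hypothetical operators: name -> (preconditions, add effects, delete effects).
# In the paper these abstractions are learned from demonstrations, not hand-coded.
OPERATORS = {
    "pick_block": (frozenset({"hand_empty", "block_on_table"}),
                   frozenset({"holding_block"}),
                   frozenset({"hand_empty", "block_on_table"})),
    "place_on_shelf": (frozenset({"holding_block"}),
                       frozenset({"block_on_shelf", "hand_empty"}),
                       frozenset({"holding_block"})),
}

def plan(start, goal):
    """Breadth-first search over symbolic states; returns operator names or None."""
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # every goal predicate holds in this state
            return steps
        for name, (pre, add, delete) in OPERATORS.items():
            if pre <= state:  # operator is applicable
                nxt = (state - delete) | add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None
```

Each operator name in the returned abstract plan would then be handed to a learned skill that turns it into low-level robot commands; that refinement step is not modeled here.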
Decremental Dynamics Planning for Robot Navigation
Lu, Yuanjie, Xu, Tong, Wang, Linji, Hawes, Nick, Xiao, Xuesu
Most, if not all, robot navigation systems employ a decomposed planning framework that includes global and local planning. To trade off onboard computation and plan quality, current systems have to limit all robot dynamics considerations only within the local planner, while leveraging an extremely simplified robot representation (e.g., a point-mass holonomic model without dynamics) in the global level. However, such an artificial decomposition based on either full or zero consideration of robot dynamics can lead to gaps between the two levels, e.g., a global path based on a holonomic point-mass model may not be realizable by a non-holonomic robot, especially in highly constrained obstacle environments. To validate the effectiveness of this paradigm, we augment three different planners with DDP and show overall improved planning performance. Navigation is a fundamental capability for autonomous mobile robots, enabling them to effectively traverse complex environments without collisions. As the demand for robotic systems grows across various domains, such as industrial automation, search and rescue, and autonomous delivery, the need for efficient and robust navigation strategies becomes increasingly important. Traditionally, most robot navigation systems adopt a hierarchical planning framework, decomposing the planning process into global and local planning.
New SDF command will be key for contingency planning, Australian commander says
The Self-Defense Forces' newly launched Joint Operations Command (JJOC) will play a critical role in coordinating responses with allies and partners to a broad spectrum of potential crises, the chief of a similar Australian military command established in 2004 told The Japan Times. "I see this as another milestone in an ever-evolving and strengthening relationship with Japan," Vice Adm. Justin Jones, the Australian Defence Force's (ADF) chief of joint operations, said in an exclusive interview Monday, noting that the new structure will not only enable direct communication with similar commands in partner countries but also result in greater speed and efficiency when coordinating and conducting joint operations. "Without a doubt, the new SDF command will be enormously important for contingency planning," he said.
Synthesizing world models for bilevel planning
Ahmed, Zergham, Tenenbaum, Joshua B., Bates, Christopher J., Gershman, Samuel J.
Modern reinforcement learning (RL) systems have demonstrated remarkable capabilities in complex environments, such as video games. However, they still fall short of achieving human-like sample efficiency and adaptability when learning new domains. Theory-based reinforcement learning (TBRL) is an algorithmic framework specifically designed to address this gap. Modeled on cognitive theories, TBRL leverages structured, causal world models - "theories" - as forward simulators for use in planning, generalization and exploration. Although current TBRL systems provide compelling explanations of how humans learn to play video games, they face several technical limitations: their theory languages are restrictive, and their planning algorithms are not scalable. To address these challenges, we introduce TheoryCoder, an instantiation of TBRL that exploits hierarchical representations of theories and efficient program synthesis methods for more powerful learning and planning. TheoryCoder equips agents with general-purpose abstractions (e.g., "move to"), which are then grounded in a particular environment by learning a low-level transition model (a Python program synthesized from observations by a large language model). A bilevel planning algorithm can exploit this hierarchical structure to solve large domains. We demonstrate that this approach can be successfully applied to diverse and challenging grid-world games, where approaches based on directly synthesizing a policy perform poorly. Ablation studies demonstrate the benefits of using hierarchical abstractions.
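As a rough illustration of the bilevel idea in this abstract, the sketch below refines abstract "move to" steps into primitive actions by searching a low-level transition model. Everything concrete here is an assumption: the 3x3 grid, the wall position, and the function names are hypothetical, the transition function is hand-written rather than synthesized by a language model, and the high level is reduced to a fixed list of subgoals rather than a real abstract planner.

```python
from collections import deque

SIZE = 3            # hypothetical 3x3 grid
WALLS = {(1, 1)}    # one blocked cell in the middle
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def transition(pos, action):
    """Low-level model; stands in for the synthesized transition program."""
    dx, dy = MOVES[action]
    nxt = (pos[0] + dx, pos[1] + dy)
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    return nxt if inside and nxt not in WALLS else pos

def move_to(start, target):
    """Refine one abstract 'move to' step into primitive actions via BFS."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        pos, actions = queue.popleft()
        if pos == target:
            return actions
        for action in MOVES:
            nxt = transition(pos, action)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None

def bilevel_plan(start, subgoals):
    """High level: ordered subgoals. Low level: search between consecutive ones."""
    pos, full_plan = start, []
    for target in subgoals:
        segment = move_to(pos, target)
        if segment is None:
            return None  # this abstract step cannot be refined
        full_plan += segment
        pos = target
    return full_plan
```

Splitting the problem this way keeps each low-level search short, which is the intuition behind using hierarchical structure to handle larger domains.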
Reinforcement Learning for Adaptive Planner Parameter Tuning: A Perspective on Hierarchical Architecture
Wangtao, Lu, Yufei, Wei, Jiadong, Xu, Wenhao, Jia, Liang, Li, Rong, Xiong, Yue, Wang
Automatic parameter tuning methods for planning algorithms, which integrate pipeline approaches with learning-based techniques, are regarded as promising due to their stability and capability to handle highly constrained environments. While existing parameter tuning methods have demonstrated considerable success, further performance improvements require a more structured approach. In this paper, we propose a hierarchical architecture for reinforcement learning-based parameter tuning. The architecture introduces a hierarchical structure with low-frequency parameter tuning, mid-frequency planning, and high-frequency control, enabling concurrent enhancement of both upper-layer parameter tuning and lower-layer control through iterative training. Experimental evaluations in both simulated and real-world environments show that our method surpasses existing parameter tuning approaches. Furthermore, our approach achieves first place in the Benchmark for Autonomous Robot Navigation (BARN) Challenge.
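The three-rate structure described in this abstract can be sketched as nested loops driven by a single control tick. The ratios below (replan every 5 ticks, retune every 20) and the function name are illustrative assumptions, not values from the paper, and the layer bodies are left as comments.

```python
def run_hierarchy(total_ticks, plan_every=5, tune_every=20):
    """Count how often each layer runs when the layers are nested at different rates."""
    counts = {"tune": 0, "plan": 0, "control": 0}
    for tick in range(total_ticks):
        if tick % tune_every == 0:
            # Low frequency: a learned policy would adjust planner parameters here.
            counts["tune"] += 1
        if tick % plan_every == 0:
            # Mid frequency: the planner would replan with the current parameters.
            counts["plan"] += 1
        # High frequency: the controller tracks the latest plan on every tick.
        counts["control"] += 1
    return counts
```

With the default ratios, 100 control ticks trigger 20 planning steps and 5 parameter updates.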
LLMs as Planning Modelers: A Survey for Leveraging Large Language Models to Construct Automated Planning Models
Tantakoun, Marcus, Zhu, Xiaodan, Muise, Christian
Large Language Models (LLMs) excel in various natural language tasks but often struggle with long-horizon planning problems requiring structured reasoning. This limitation has drawn interest in integrating neuro-symbolic approaches within the Automated Planning (AP) and Natural Language Processing (NLP) communities. However, identifying optimal AP deployment frameworks can be daunting. This paper aims to provide a timely survey of the current research with an in-depth analysis, positioning LLMs as tools for extracting and refining planning models to support reliable AP planners. By systematically reviewing the current state of research, we highlight methodologies, and identify critical challenges and future directions, hoping to contribute to the joint research on NLP and Automated Planning.
Reachability-Guaranteed Optimal Control for the Interception of Dynamic Targets under Uncertainty
Faraci, Tommaso, Lampariello, Roberto
Intercepting dynamic objects in uncertain environments involves a significant unresolved challenge in modern robotic systems. Current control approaches rely solely on estimated information, and results lack guarantees of robustness and feasibility. In this work, we introduce a novel method to tackle the interception of targets whose motion is affected by known and bounded uncertainty. Our approach introduces new techniques of reachability analysis for rigid bodies, leveraged to guarantee feasibility of interception under uncertain conditions. We then propose a Reachability-Guaranteed Optimal Control Problem, ensuring robustness and guaranteed reachability to a target set of configurations. We demonstrate the methodology in the case study of an interception maneuver of a tumbling target in space.
Strength Estimation and Human-Like Strength Adjustment in Games
Chen, Chun Jung, Shih, Chung-Chin, Wu, Ti-Rong
Strength estimation and adjustment are crucial in designing human-AI interactions, particularly in games where AI surpasses human players. This paper introduces a novel strength system, including a strength estimator (SE) and an SE-based Monte Carlo tree search, denoted as SE-MCTS, which predicts strengths from games and offers different playing strengths with human styles. The strength estimator calculates strength scores and predicts ranks from games without direct human interaction. SE-MCTS utilizes the strength scores in a Monte Carlo tree search to adjust playing strength and style. We first conduct experiments in Go, a challenging board game with a wide range of ranks. Our strength estimator achieves over 80% accuracy in predicting ranks by observing only 15 games, whereas the previous method reached 49% accuracy for 100 games. For strength adjustment, SE-MCTS successfully adjusts to designated ranks while achieving 51.33% accuracy in aligning to human actions, outperforming the previous state-of-the-art method, which reached only 42.56% accuracy. To demonstrate the generality of our strength system, we further apply SE and SE-MCTS to chess and obtain consistent results. These results show a promising approach to strength estimation and adjustment, enhancing human-AI interactions in games. Artificial intelligence has achieved superhuman performance in various domains in recent years, especially in games (Silver et al., 2018; Schrittwieser et al., 2020; Vinyals et al., 2019; OpenAI et al., 2019). These achievements have raised interest within the community in exploring AI programs for human interactions, particularly in estimating human players' strengths and offering corresponding levels to increase entertainment or improve skills (Demediuk et al., 2017; Fan et al., 2019; Moon & Seo, 2020; Gusmão et al., 2015; Silva et al., 2015; Hunicke & Chapman, 2004).
OptionZero: Planning with Learned Options
Huang, Po-Wei, Peng, Pei-Chiun, Guei, Hung, Wu, Ti-Rong
Planning with options -- a sequence of primitive actions -- has been shown effective in reinforcement learning within complex environments. Previous studies have focused on planning with predefined options or learned options through expert demonstration data. Inspired by MuZero, which learns superhuman heuristics without any human knowledge, we propose a novel approach, named OptionZero. OptionZero incorporates an option network into MuZero, providing autonomous discovery of options through self-play games. Furthermore, we modify the dynamics network to provide environment transitions when using options, allowing deeper search under the same simulation constraints. Empirical experiments conducted in 26 Atari games demonstrate that OptionZero outperforms MuZero, achieving a 131.58% improvement in mean human-normalized score. Our behavior analysis shows that OptionZero not only learns options but also acquires strategic skills tailored to different game characteristics. Our findings show promising directions for discovering and using options in planning. Our code is available at https://rlg.iis.sinica.edu.tw/papers/optionzero.
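To make the notion of an option concrete, the sketch below treats an option as a tuple of primitive actions and rolls it through a toy dynamics function, returning the final state and the discounted reward collected along the way. This is only an illustration under stated assumptions: the one-dimensional environment, the reward at position 3, and the discount of 0.5 are hypothetical, and OptionZero uses a learned dynamics network rather than a hand-written step function.

```python
def step(state, action):
    """Toy deterministic dynamics: the state is an integer position on a line."""
    nxt = state + (1 if action == "R" else -1)
    reward = 1.0 if nxt == 3 else 0.0  # hypothetical reward for reaching position 3
    return nxt, reward

def apply_option(state, option, discount=0.5):
    """Apply a sequence of primitive actions as a single planning move."""
    total, scale = 0.0, 1.0
    for action in option:
        state, reward = step(state, action)
        total += scale * reward  # reward discounted by its depth inside the option
        scale *= discount
    return state, total
```

A search that expands the option `("R", "R", "R")` as one edge reaches a state three primitive steps ahead, which is the sense in which options let a planner look deeper for the same number of simulations.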