Search
Parallelizing Multi-objective A* Search
Ahmadi, Saman, Sturtevant, Nathan R., Raith, Andrea, Harabor, Daniel, Jalili, Mahdi
The Multi-objective Shortest Path (MOSP) problem is a classic network optimization problem that aims to find all Pareto-optimal paths between two points in a graph with multiple edge costs. Recent studies on multi-objective search with A* (MOA*) have demonstrated superior performance in solving difficult MOSP instances. This paper presents a novel search framework that allows efficient parallelization of MOA* with different objective orders. The framework incorporates a unique upper bounding strategy that helps the search reduce the problem's dimensionality to one in certain cases. Experimental results demonstrate that the proposed framework can enhance the performance of recent A*-based solutions, with the speed-up proportional to the problem dimension.
The Value of Goal Commitment in Planning
Pozanco, Alberto, Morales, Marianela, Borrajo, Daniel, Veloso, Manuela
In this paper, we revisit the concept of goal commitment from early planners in the presence of current forward chaining heuristic planners. We present a compilation that extends the original planning task with commit actions that enforce the persistence of specific goals once achieved, thereby committing to them in the search sub-tree. This approach imposes a specific goal achievement order in parts of the search tree, potentially introducing dead-end states. This can reduce search effort if the goal achievement order is correct. Otherwise, the search algorithm can expand nodes in the open list where goals do not persist. Experimental results demonstrate that the reformulated tasks suit state-of-the-art agile planners, enabling them to find better
Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving
Rajaee, Sara, Pratik, Kumar, Cesa, Gabriele, Behboodi, Arash
The most promising recent methods for AI reasoning require applying variants of reinforcement learning (RL) either on rolled out trajectories from the model, even for the step-wise rewards, or large quantities of human annotated trajectory data. The reliance on the rolled-out trajectory renders the compute cost and time prohibitively high. In particular, the correctness of a reasoning trajectory can typically only be judged at its completion, leading to sparse rewards in RL or requiring expensive synthetic data generation in expert iteration-like methods. In this work, we focus on the Automatic Theorem Proving (ATP) task and propose a novel verifier-in-the-loop design, which unlike existing approaches that leverage feedback on the entire reasoning trajectory, employs an automated verifier to give intermediate feedback at each step of the reasoning process. Using Lean as the verifier, we empirically show that the step-by-step local verification produces a global improvement in the model's reasoning accuracy and efficiency.
Fair Play in the Fast Lane: Integrating Sportsmanship into Autonomous Racing Systems
Huang, Zhenmin, Hao, Ce, Zhan, Wei, Ma, Jun, Tomizuka, Masayoshi
Autonomous racing has gained significant attention as a platform for high-speed decision-making and motion control. While existing methods primarily focus on trajectory planning and overtaking strategies, the role of sportsmanship in ensuring fair competition remains largely unexplored. In human racing, rules such as the one-motion rule and the enough-space rule prevent dangerous and unsportsmanlike behavior. However, autonomous racing systems often lack mechanisms to enforce these principles, potentially leading to unsafe maneuvers. This paper introduces a bi-level game-theoretic framework to integrate sportsmanship (SPS) into versus racing. At the high level, we model racing intentions using a Stackelberg game, where Monte Carlo Tree Search (MCTS) is employed to derive optimal strategies. At the low level, vehicle interactions are formulated as a Generalized Nash Equilibrium Problem (GNEP), ensuring that all agents follow sportsmanship constraints while optimizing their trajectories. Simulation results demonstrate the effectiveness of the proposed approach in enforcing sportsmanship rules while maintaining competitive performance. We analyze different scenarios where attackers and defenders adhere to or disregard sportsmanship rules and show how knowledge of these constraints influences strategic decision-making. This work highlights the importance of balancing competition and fairness in autonomous racing and provides a foundation for developing ethical and safe AI-driven racing systems.
Minimax Optimality of the Probability Flow ODE for Diffusion Models
Score-based diffusion models have become a foundational paradigm for modern generative modeling, demonstrating exceptional capability in generating samples from complex high-dimensional distributions. Despite the dominant adoption of probability flow ODE-based samplers in practice due to their superior sampling efficiency and precision, rigorous statistical guarantees for these methods have remained elusive in the literature. This work develops the first end-to-end theoretical framework for deterministic ODE-based samplers that establishes near-minimax optimal guarantees under mild assumptions on target data distributions. Specifically, focusing on subgaussian distributions with $\beta$-H\"older smooth densities for $\beta\leq 2$, we propose a smooth regularized score estimator that simultaneously controls both the $L^2$ score error and the associated mean Jacobian error. Leveraging this estimator within a refined convergence analysis of the ODE-based sampling process, we demonstrate that the resulting sampler achieves the minimax rate in total variation distance, modulo logarithmic factors. Notably, our theory comprehensively accounts for all sources of error in the sampling process and does not require strong structural conditions such as density lower bounds or Lipschitz/smooth scores on target distributions, thereby covering a broad range of practical data distributions.
Learning to Search Effective Example Sequences for In-Context Learning
Gao, Xiang, Sinha, Ankita, Das, Kamalika
Large language models (LLMs) demonstrate impressive few-shot learning capabilities, but their performance varies widely based on the sequence of in-context examples. Key factors influencing this include the sequence's length, composition, and arrangement, as well as its relation to the specific query. Existing methods often tackle these factors in isolation, overlooking their interdependencies. Moreover, the extensive search space for selecting optimal sequences complicates the development of a holistic approach. In this work, we introduce Beam Search-based Example Sequence Constructor (BESC), a novel method for learning to construct optimal example sequences. BESC addresses all key factors involved in sequence selection by considering them jointly during inference, while incrementally building the sequence. This design enables the use of beam search to significantly reduce the complexity of the search space. Experiments across various datasets and language models show notable improvements in performance.
Large Neighborhood Search and Bitmask Dynamic Programming for Wireless Mobile Charging Electric Vehicle Routing Problems in Medical Transportation
Zhao, Jingyi, Yang, Haoxiang, Liu, Yang
The transition to electric vehicles (EVs) is critical to achieving sustainable transportation, but challenges such as limited driving range and insufficient charging infrastructure have hindered the widespread adoption of EVs, especially in time-sensitive logistics such as medical transportation. This paper presents a new model to break through this barrier by combining wireless mobile charging technology with optimization. We propose the Wireless Mobile Charging Electric Vehicle Routing Problem (WMC-EVRP), which enables Medical Transportation Electric Vehicles (MTEVs) to be charged while traveling via Mobile Charging Carts (MCTs). This eliminates the time wastage of stopping for charging and ensures uninterrupted operation of MTEVs for such time-sensitive transportation problems. However, in this problem, the decisions of these two types of heterogeneous vehicles are coupled with each other, which greatly increases the difficulty of vehicle routing optimizations. To address this complex problem, we develop a mathematical model and a tailored meta-heuristic algorithm that combines Bit Mask Dynamic Programming (BDP) and Large Neighborhood Search (LNS). The BDP approach efficiently optimizes charging strategies, while the LNS framework utilizes custom operators to optimize the MTEV routes under capacity and synchronization constraints. Our approach outperforms traditional solvers in providing solutions for medium and large instances. Using actual hospital locations in Singapore as data, we validated the practical applicability of the model through extensive experiments and provided important insights into minimizing costs and ensuring the timely delivery of healthcare services.
Generative Models in Decision Making: A Survey
Li, Yinchuan, Shao, Xinyu, Zhang, Jianping, Wang, Haozhi, Brunswic, Leo Maxime, Zhou, Kaiwen, Dong, Jiqian, Guo, Kaiyang, Li, Xiu, Chen, Zhitang, Wang, Jun, Hao, Jianye
In recent years, the exceptional performance of generative models in generative tasks has sparked significant interest in their integration into decision-making processes. Due to their ability to handle complex data distributions and their strong model capacity, generative models can be effectively incorporated into decision-making systems by generating trajectories that guide agents toward high-reward state-action regions or intermediate sub-goals. This paper presents a comprehensive review of the application of generative models in decision-making tasks. We classify seven fundamental types of generative models: energy-based models, generative adversarial networks, variational autoencoders, normalizing flows, diffusion models, generative flow networks, and autoregressive models. Regarding their applications, we categorize their functions into three main roles: controllers, modelers and optimizers, and discuss how each role contributes to decision-making. Furthermore, we examine the deployment of these models across five critical real-world decision-making scenarios. Finally, we summarize the strengths and limitations of current approaches and propose three key directions for advancing next-generation generative directive models: high-performance algorithms, large-scale generalized decision-making models, and self-evolving and adaptive models.
($\boldsymbol{\theta}_l, \boldsymbol{\theta}_u$)-Parametric Multi-Task Optimization: Joint Search in Solution and Infinite Task Spaces
Wei, Tingyang, Liu, Jiao, Gupta, Abhishek, Tan, Puay Siew, Ong, Yew-Soon
Multi-task optimization is typically characterized by a fixed and finite set of optimization tasks. The present paper relaxes this condition by considering a non-fixed and potentially infinite set of optimization tasks defined in a parameterized, continuous and bounded task space. We refer to this unique problem setting as parametric multi-task optimization (PMTO). Assuming the bounds of the task parameters to be ($\boldsymbol{\theta}_l$, $\boldsymbol{\theta}_u$), a novel ($\boldsymbol{\theta}_l$, $\boldsymbol{\theta}_u$)-PMTO algorithm is crafted to enable joint search over tasks and their solutions. This joint search is supported by two approximation models: (1) for mapping solutions to the objective spaces of all tasks, which provably accelerates convergence by acting as a conduit for inter-task knowledge transfers, and (2) for probabilistically mapping tasks to the solution space, which facilitates evolutionary exploration of under-explored regions of the task space. At the end of a full ($\boldsymbol{\theta}_l$, $\boldsymbol{\theta}_u$)-PMTO run, the acquired models enable rapid identification of optimized solutions for any task lying within the specified bounds. This outcome is validated on both synthetic test problems and practical case studies, with the significant real-world applicability of PMTO shown towards fast reconfiguration of robot controllers under changing task conditions. The potential of PMTO to vastly speedup the search for solutions to minimax optimization problems is also demonstrated through an example in robust engineering design.
Multi-Robot System for Cooperative Exploration in Unknown Environments: A Survey
Wang, Chuqi, Yu, Chao, Xu, Xin, Gao, Yuman, Yang, Xinyi, Tang, Wenhao, Yu, Shu'ang, Chen, Yinuo, Gao, Feng, Jian, ZhuoZhu, Chen, Xinlei, Gao, Fei, Zhou, Boyu, Wang, Yu
With the advancement of multi-robot technology, cooperative exploration tasks have garnered increasing attention. This paper presents a comprehensive review of multi-robot cooperative exploration systems. First, we review the evolution of robotic exploration and introduce a modular research framework tailored for multi-robot cooperative exploration. Based on this framework, we systematically categorize and summarize key system components. As a foundational module for multi-robot exploration, the localization and mapping module is primarily introduced by focusing on global and relative pose estimation, as well as multi-robot map merging techniques. The cooperative motion module is further divided into learning-based approaches and multi-stage planning, with the latter encompassing target generation, task allocation, and motion planning strategies. Given the communication constraints of real-world environments, we also analyze the communication module, emphasizing how robots exchange information within local communication ranges and under limited transmission capabilities. Finally, we discuss the challenges and future research directions for multi-robot cooperative exploration in light of real-world trends. This review aims to serve as a valuable reference for researchers and practitioners in the field.