Goto

Collaborating Authors

 optimal path






Optimize Planning Heuristics to Rank, not to Estimate Cost-to-Goal

Neural Information Processing Systems

Figure 1: Problem instance where perfect heuristic is not strictly optimally efficient with GBFS. However, the path (A, C,D, E) has cost 10 instead of 11 . Then h is a perfect ranking for GBFS on Γ. Proof. We carry the proof by induction with respect to the number of expanded states. Let's now make the induction step and assume the theorem holds for the first A 0 B 1 C 1 D 2 A 1 1 9 9 1 Figure 2: Problem instance where optimally efficient heuristic does not exists for GBFS.



Appendix to " GraphMP: Graph Neural Network-based Motion Planning with Efficient Graph Search "

Neural Information Processing Systems

The overall network architecture is shown in Figure 1. This work was done when the author was with Rutgers University. The overall network architecture is shown in Figure 1. We also apply the ReLU activation after its first and second layers. Empirical evaluations show that NHE exhibits admissibility and consistency.


Universal Hirschberg for Width Bounded Dynamic Programs

Nye, Logan

arXiv.org Artificial Intelligence

Hirschberg's algorithm (1975) reduces the space complexity for the longest common subsequence problem from $O(N^2)$ to $O(N)$ via recursive midpoint bisection on a grid dynamic program (DP). We show that the underlying idea generalizes to a broad class of dynamic programs with local dependencies on directed acyclic graphs (DP DAGs). Modeling a DP as deterministic time evolution over a topologically ordered DAG with frontier width $ω$ and bounded in-degree, and assuming a max-type semiring with deterministic tie breaking, we prove that in a standard offline random-access model any such DP admits deterministic traceback in space $O(ω\log T + (\log T)^{O(1)})$ cells over a fixed finite alphabet, where $T$ is the number of states. Our construction replaces backward dynamic programs by forward-only recomputation and organizes the time order into a height-compressed recursion tree whose nodes expose small "middle frontiers'' across which every optimal path must pass. The framework yields near-optimal traceback bounds for asymmetric and banded sequence alignment, one-dimensional recurrences, and dynamic-programming formulations on graphs of bounded pathwidth. We also show that an $Ω(ω)$ space term (in bits) is unavoidable in forward single-pass models and discuss conjectured $\sqrt{T}$-type barriers in streaming settings, supporting the view that space-efficient traceback is a structural property of width-bounded DP DAGs rather than a peculiarity of grid-based algorithms.


A $1000\times$ Faster LLM-enhanced Algorithm For Path Planning in Large-scale Grid Maps

Zeng, Junlin, Zhang, Xin, Zhao, Xiang, Pan, Yan

arXiv.org Artificial Intelligence

Path planning in grid maps, arising from various applications, has garnered significant attention. Existing methods, such as A*, Dijkstra, and their variants, work well for small-scale maps but fail to address large-scale ones due to high search time and memory consumption. Recently, Large Language Models (LLMs) have shown remarkable performance in path planning but still suffer from spatial illusion and poor planning performance. Among all the works, LLM-A* \cite{meng2024llm} leverages LLM to generate a series of waypoints and then uses A* to plan the paths between the neighboring waypoints. In this way, the complete path is constructed. However, LLM-A* still suffers from high computational time for large-scale maps. To fill this gap, we conducted a deep investigation into LLM-A* and found its bottleneck, resulting in limited performance. Accordingly, we design an innovative LLM-enhanced algorithm, abbr. as iLLM-A*. iLLM-A* includes 3 carefully designed mechanisms, including the optimization of A*, an incremental learning method for LLM to generate high-quality waypoints, and the selection of the appropriate waypoints for A* for path planning. Finally, a comprehensive evaluation on various grid maps shows that, compared with LLM-A*, iLLM-A* \textbf{1) achieves more than $1000\times$ speedup on average, and up to $2349.5\times$ speedup in the extreme case, 2) saves up to $58.6\%$ of the memory cost, 3) achieves both obviously shorter path length and lower path length standard deviation.}


Predictive Safety Shield for Dyna-Q Reinforcement Learning

Pin, Jin, Hanna, Krasowski, Elena, Vanneaux

arXiv.org Artificial Intelligence

Obtaining safety guarantees for reinforcement learning is a major challenge to achieve applicability for real-world tasks. Safety shields extend standard reinforcement learning and achieve hard safety guarantees. However, existing safety shields commonly use random sampling of safe actions or a fixed fallback controller, therefore disregarding future performance implications of different safe actions. In this work, we propose a predictive safety shield for model-based reinforcement learning agents in discrete space. Our safety shield updates the Q-function locally based on safe predictions, which originate from a safe simulation of the environment model. This shielding approach improves performance while maintaining hard safety guarantees. Our experiments on gridworld environments demonstrate that even short prediction horizons can be sufficient to identify the optimal path. We observe that our approach is robust to distribution shifts, e.g., between simulation and reality, without requiring additional training.