waypoint
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (3 more...)
- Asia > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.43)
NASA used Claude to plot a route for its Perseverance rover on Mars
No, the chatbot did not crash Perseverance. Since 2021, NASA's Perseverance rover has achieved a number of historic milestones, including sending back the first audio recordings from Mars . Now, nearly five years after landing on the Red Planet, it just achieved another feat. This past December, Perseverance successfully completed a route through a section of the Jezero crater plotted by Anthropic's Claude chatbot, marking the first time NASA has used a large language model to pilot the car-sized robot. Between December 8 and 10, Perseverance drove approximately 400 meters (about 437 yards) through a field of rocks on the Martian surface mapped out by Claude.
- Government > Space Agency (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Accelerating Motion Planning via Optimal Transport
Motion planning is still an open problem for many disciplines, e.g., robotics, autonomous driving, due to their need for high computational resources that hinder real-time, efficient decision-making. A class of methods striving to provide smooth solutions is gradient-based trajectory optimization. However, those methods usually suffer from bad local minima, while for many settings, they may be inapplicable due to the absence of easy-to-access gradients of the optimization objectives. In response to these issues, we introduce Motion Planning via Optimal Transport (MPOT)---a \textit{gradient-free} method that optimizes a batch of smooth trajectories over highly nonlinear costs, even for high-dimensional tasks, while imposing smoothness through a Gaussian Process dynamics prior via the planning-as-inference perspective. To facilitate batch trajectory optimization, we introduce an original zero-order and highly-parallelizable update rule----the Sinkhorn Step, which uses the regular polytope family for its search directions. Each regular polytope, centered on trajectory waypoints, serves as a local cost-probing neighborhood, acting as a \textit{trust region} where the Sinkhorn Step ``transports'' local waypoints toward low-cost regions. We theoretically show that Sinkhorn Step guides the optimizing parameters toward local minima regions of non-convex objective functions. We then show the efficiency of MPOT in a range of problems from low-dimensional point-mass navigation to high-dimensional whole-body robot motion planning, evincing its superiority compared to popular motion planners, paving the way for new applications of optimal transport in motion planning.
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation
Language-guided robotic manipulation is a challenging task that requires an embodied agent to follow abstract user instructions to accomplish various complex manipulation tasks. Previous work generally maps instructions and visual perceptions directly to low-level executable actions, neglecting the modeling of critical waypoints (e.g., key states of "close to/grab/move up" in action trajectories) in manipulation tasks.To address this issue, we propose a PImitive-driVen waypOinT-aware world model for Robotic manipulation (PIVOT-R) that focuses solely on the prediction of task-relevant waypoints. Specifically, PIVOT-R consists of a Waypoint-aware World Model (WAWM) and a lightweight action prediction module. The former performs primitive action parsing and primitive-driven waypoint prediction, while the latter focuses on decoding low-level actions. Additionally, we also design an asynchronous hierarchical executor (AHE) for PIVOT-R, which can use different execution frequencies for different modules of the model, thereby helping the model reduce computational redundancy and improve model execution efficiency. Our PIVOT-R outperforms state-of-the-art (SoTA) open-source models on the SeaWave benchmark, achieving an average relative improvement of 19.45% across four levels of instruction tasks. Moreover, compared to the synchronously executed PIVOT-R, the execution efficiency of PIVOT-R with AHE is increased by 28-fold, with only a 2.9% drop in performance. These results provide compelling evidence that our PIVOT-R can significantly improve both the performance and efficiency of robotic manipulation.
Push Smarter, Not Harder: Hierarchical RL-Diffusion Policy for Efficient Nonprehensile Manipulation
Caro, Steven, Smith, Stephen L.
Nonprehensile manipulation, such as pushing objects across cluttered environments, presents a challenging control problem due to complex contact dynamics and long-horizon planning requirements. In this work, we propose HeRD, a hierarchical reinforcement learning-diffusion policy that decomposes pushing tasks into two levels: high-level goal selection and low-level trajectory generation. We employ a high-level reinforcement learning (RL) agent to select intermediate spatial goals, and a low-level goal-conditioned diffusion model to generate feasible, efficient trajectories to reach them. This architecture combines the long-term reward maximizing behaviour of RL with the generative capabilities of diffusion models. We evaluate our method in a 2D simulation environment and show that it outperforms the state-of-the-art baseline in success rate, path efficiency, and generalization across multiple environment configurations. Our results suggest that hierarchical control with generative low-level planning is a promising direction for scalable, goal-directed nonprehensile manipulation. Code, documentation, and trained models are available: https://github.com/carosteven/HeRD.
- North America > United States (0.04)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
C*: A Coverage Path Planning Algorithm for Unknown Environments using Rapidly Covering Graphs
Shen, Zongyuan, Wilson, James P., Gupta, Shalabh
The paper presents a novel sample-based algorithm, called C*, for real-time coverage path planning (CPP) of unknown environments. C* is built upon the concept of a Rapidly Covering Graph (RCG), which is incrementally constructed during robot navigation via progressive sampling of the search space. By using efficient sampling and pruning techniques, the RCG is constructed to be a minimum-sufficient graph, where its nodes and edges form the potential waypoints and segments of the coverage trajectory, respectively. The RCG tracks the coverage progress, generates the coverage trajectory and helps the robot to escape from the dead-end situations. To minimize coverage time, C* produces the desired back-and-forth coverage pattern, while adapting to the TSP-based optimal coverage of local isolated regions, called coverage holes, which are surrounded by obstacles and covered regions. It is analytically proven that C* provides complete coverage of unknown environments. The algorithmic simplicity and low computational complexity of C* make it easy to implement and suitable for real-time on-board applications. The performance of C* is validated by 1) extensive high-fidelity simulations and 2) laboratory experiments using an autonomous robot. C* yields near optimal trajectories, and a comparative evaluation with seven existing CPP methods demonstrates significant improvements in performance in terms of coverage time, number of turns, trajectory length, and overlap ratio, while preventing the formation of coverage holes. Finally, C* is comparatively evaluated on two different CPP applications using 1) energy-constrained robots and 2) multi-robot teams.
- North America > United States > Connecticut > Tolland County > Storrs (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Singapore (0.04)
BridgeDrive: Diffusion Bridge Policy for Closed-Loop Trajectory Planning in Autonomous Driving
Liu, Shu, Chen, Wenlin, Li, Weihao, Wang, Zheng, Yang, Lijin, Huang, Jianing, Zhang, Yipin, Huang, Zhongzhan, Cheng, Ze, Yang, Hao
Diffusion-based planners have shown great promise for autonomous driving due to their ability to capture multi-modal driving behaviors. However, guiding these models effectively in reactive, closed-loop environments remains a significant challenge. Simple conditioning often fails to provide sufficient guidance in complex and dynamic driving scenarios. Recent work attempts to use typical expert driving behaviors (i.e., anchors) to guide diffusion models but relies on a truncated schedule, which introduces theoretical inconsistencies and can compromise performance. To address this, we introduce BridgeDrive, a novel anchor-guided diffusion bridge policy for closed-loop trajectory planning. Our approach provides a principled diffusion framework that effectively translates anchors into fine-grained trajectory plans, appropriately responding to varying traffic conditions. Our planner is compatible with efficient ODE solvers, a critical factor for real-time autonomous driving deployment. We achieve state-of-the-art performance on the Bench2Drive benchmark, improving the success rate by 7.72% over prior arts. Closed-loop planning with reactive agents is a critical challenge in autonomous driving, which requires effective interaction with complex and dynamic traffic environments (Jia et al., 2024).
IGUANA: Immersive Guidance, Navigation, and Control for Consumer UAV
Victor, Victor, Krisanty, Tania, McGinity, Matthew, Gumhold, Stefan, Aßmann, Uwe
As the markets for unmanned aerial vehicles (UAVs) and mixed reality (MR) headsets continue to grow, recent research has increasingly explored their integration, which enables more intuitive, immersive, and situationally aware control systems. We present IGUANA, an MR-based immersive guidance, navigation, and control system for consumer UAVs. IGUANA introduces three key elements beyond conventional control interfaces: (1) a 3D terrain map interface with draggable waypoint markers and live camera preview for high-level control, (2) a novel spatial control metaphor that uses a virtual ball as a physical analogy for low-level control, and (3) a spatial overlay that helps track the UAV when it is not visible with the naked eye or visual line of sight is interrupted. We conducted a user study to evaluate our design, both quantitatively and qualitatively, and found that (1) the 3D map interface is intuitive and easy to use, relieving users from manual control and suggesting improved accuracy and consistency with lower perceived workload relative to conventional dual-stick controller, (2) the virtual ball interface is intuitive but limited by the lack of physical feedback, and (3) the spatial overlay is very useful in enhancing the users' situational awareness.
- North America > Canada > Quebec > Montreal (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Germany > Saxony > Dresden (0.05)
- (9 more...)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study > Negative Result (0.68)
- Government (1.00)
- Information Technology > Robotics & Automation (0.88)
Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation
Wei, Meng, Wan, Chenyang, Peng, Jiaqi, Yu, Xiqian, Yang, Yuqiang, Feng, Delin, Cai, Wenzhe, Zhu, Chenming, Wang, Tai, Pang, Jiangmiao, Liu, Xihui
While recent large vision-language models (VLMs) have improved generalization in vision-language navigation (VLN), existing methods typically rely on end-to-end pipelines that map vision-language inputs directly to short-horizon discrete actions. Such designs often produce fragmented motions, incur high latency, and struggle with real-world challenges like dynamic obstacle avoidance. We propose DualVLN, the first dual-system VLN foundation model that synergistically integrates high-level reasoning with low-level action execution. System 2, a VLM-based global planner, "grounds slowly" by predicting mid-term waypoint goals via image-grounded reasoning. System 1, a lightweight, multi-modal conditioning Diffusion Transformer policy, "moves fast" by leveraging both explicit pixel goals and latent features from System 2 to generate smooth and accurate trajectories. The dual-system design enables robust real-time control and adaptive local decision-making in complex, dynamic environments. By decoupling training, the VLM retains its generalization, while System 1 achieves interpretable and effective local navigation. DualVLN outperforms prior methods across all VLN benchmarks and real-world experiments demonstrate robust long-horizon planning and real-time adaptability in dynamic environments.
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Hong Kong (0.04)