manocha
Splatblox: Traversability-Aware Gaussian Splatting for Outdoor Robot Navigation
Chopra, Samarth, Liang, Jing, Seneviratne, Gershom, Lee, Yonghan, Choi, Jaehoon, An, Jianyu, Cheng, Stephen, Manocha, Dinesh
We present Splatblox, a real-time system for autonomous navigation in outdoor environments with dense vegetation, irregular obstacles, and complex terrain. Our method fuses segmented RGB images and LiDAR point clouds using Gaussian Splatting to construct a traversability-aware Euclidean Signed Distance Field (ESDF) that jointly encodes geometry and semantics. Updated online, this field enables semantic reasoning to distinguish traversable vegetation (e.g., tall grass) from rigid obstacles (e.g., trees), while LiDAR ensures 360-degree geometric coverage for extended planning horizons. We validate Splatblox on a quadruped robot and demonstrate transfer to a wheeled platform. In field trials across vegetation-rich scenarios, it outperforms state-of-the-art methods with over 50% higher success rate, 40% fewer freezing incidents, 5% shorter paths, and up to 13% faster time to goal, while supporting long-range missions up to 100 meters. Experiment videos and more details can be found on our project page: https://splatblox.github.io
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Europe > Portugal (0.04)
- Asia (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots > Locomotion (0.48)
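To make the Splatblox idea of a traversability-aware ESDF concrete, here is a minimal sketch on a 2D label grid. It is not the authors' pipeline (no Gaussian Splatting, no LiDAR fusion); it only illustrates how semantic labels could change which cells count as obstacles when the signed distance is computed. The label constants, the `RIGID` set, and the function name are assumptions made for illustration.

```python
import math

# Hypothetical semantic labels (assumptions, not from the paper).
FREE, GRASS, TREE = 0, 1, 2
RIGID = {TREE}  # only rigid classes act as obstacles; tall grass stays traversable


def traversability_esdf(labels, cell_size=1.0):
    """Brute-force signed distance to the nearest rigid-obstacle boundary.

    Positive values: distance from a passable cell to the nearest rigid cell.
    Negative values: distance from a rigid cell to the nearest passable cell.
    """
    rows, cols = len(labels), len(labels[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    rigid = [(r, c) for r, c in cells if labels[r][c] in RIGID]
    passable = [(r, c) for r, c in cells if labels[r][c] not in RIGID]

    field = [[0.0] * cols for _ in range(rows)]
    for r, c in cells:
        inside = labels[r][c] in RIGID
        targets = passable if inside else rigid
        if not targets:
            # No opposite-class cell exists anywhere in the grid.
            field[r][c] = -math.inf if inside else math.inf
            continue
        d = min(math.hypot(r - tr, c - tc) for tr, tc in targets) * cell_size
        field[r][c] = -d if inside else d
    return field


if __name__ == "__main__":
    # One row: free, grass, tree, free. Grass does not shorten the distance.
    print(traversability_esdf([[FREE, GRASS, TREE, FREE]]))
```

A planner reading this field would keep clearance from the tree cell while treating the grass cell like free space. A real-time system would use an incremental distance transform rather than this quadratic brute-force search.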
FACA: Fair and Agile Multi-Robot Collision Avoidance in Constrained Environments with Dynamic Priorities
Singh, Jaskirat, Chandra, Rohan
Multi-robot systems are increasingly being used for critical applications such as rescuing injured people, delivering food and medicines, and monitoring key areas. These applications usually involve navigating at high speeds through constrained spaces such as small gaps. Navigating such constrained spaces becomes particularly challenging when the space is crowded with multiple heterogeneous agents, all of which have urgent priorities. The problem is even harder because, during an active response situation, roles and priorities can change abruptly without the other agents being informed. To complete missions in such environments, robots must be not only safe but also agile, able to dodge and change course at a moment's notice. In this paper, we propose FACA, a fair and agile collision avoidance approach where robots coordinate their tasks by talking to each other via natural language (just as people do). In FACA, robots balance safety with agility via a novel artificial potential field algorithm that creates an automatic "roundabout" effect whenever a conflict arises. Our experiments show that FACA improves efficiency, completing missions more than 3.5X faster than baselines (a time reduction of over 70%) while maintaining robust safety margins.
- North America > United States > Virginia (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Transportation > Infrastructure & Services (0.35)
- Transportation > Ground > Road (0.35)
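The "roundabout" effect mentioned in the FACA abstract can be illustrated with a generic artificial potential field that adds a tangential term to the usual repulsion. This is a sketch under stated assumptions, not the algorithm from the paper: the gains `k_att`, `k_rep`, `k_tan`, the influence `radius`, and the function name are all hypothetical.

```python
import math


def roundabout_force(pos, goal, other, k_att=1.0, k_rep=1.0, k_tan=1.0, radius=2.0):
    """Return a 2D steering force (fx, fy) for one robot.

    pos, goal, other are (x, y) tuples; `other` is a conflicting robot.
    """
    # Attractive term: pulls the robot straight toward its goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])

    dx, dy = pos[0] - other[0], pos[1] - other[1]
    dist = math.hypot(dx, dy)
    if 0.0 < dist < radius:
        # Classic repulsive magnitude: grows as the robots get closer,
        # and vanishes at the edge of the influence radius.
        mag = k_rep * (1.0 / dist - 1.0 / radius)
        ux, uy = dx / dist, dy / dist  # unit vector pointing away from `other`
        fx += mag * ux
        fy += mag * uy
        # Tangential term: the repulsive direction rotated 90 degrees
        # counter-clockwise. If every robot uses the same rotation sense,
        # two conflicting robots circulate around each other the same way.
        fx += k_tan * mag * (-uy)
        fy += k_tan * mag * ux
    return fx, fy


if __name__ == "__main__":
    print(roundabout_force(pos=(0.0, 0.0), goal=(4.0, 0.0), other=(1.0, 0.0)))
```

Using one fixed rotation sense for all robots is the design choice that breaks the head-on symmetry: without the tangential term, two robots facing each other would only push straight back.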
Are LLMs The Way Forward? A Case Study on LLM-Guided Reinforcement Learning for Decentralized Autonomous Driving
Anvar, Timur, Chen, Jeffrey, Wang, Yuyan, Chandra, Rohan
Autonomous vehicle navigation in complex environments such as dense and fast-moving highways and merging scenarios remains an active area of research. In the past decade, many planning and control approaches have used reinforcement learning (RL) with notable success. However, a key limitation of RL is its reliance on well-specified reward functions, which often fail to capture the full semantic and social complexity of diverse, out-of-distribution situations. As a result, a rapidly growing line of research explores using Large Language Models (LLMs) to replace or supplement RL for direct planning and control, on account of their ability to reason about rich semantic context. However, LLMs present significant drawbacks: they can be unstable in zero-shot safety-critical settings, produce inconsistent outputs, and often depend on expensive API calls with network latency. This motivates our investigation into whether small, locally deployed LLMs ( 14B parameters) can meaningfully support autonomous highway driving through reward shaping rather than direct control. These models are attractive for practical deployment as they can run on a single GPU and avoid external API dependencies. We present a case study comparing RL-only, LLM-only, and hybrid approaches, where LLMs augment RL rewards by scoring state-action transitions during training, while standard RL policies execute at test time.
- North America > United States > Virginia (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Automobiles & Trucks (0.91)
- Transportation > Ground > Road (0.91)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
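The hybrid setup described in the LLM-guided RL entry above, where a language model scores transitions and the score augments the environment reward during training, can be sketched as a simple additive shaping term. Everything here is an assumption for illustration: the additive form, the `weight` value, the `headway_m` state field, and the stub scorer that stands in for a locally hosted LLM.

```python
def stub_llm_score(state, action, next_state):
    """Placeholder for an LLM judgment in [-1, 1] (hypothetical).

    A real system would prompt a local model with a description of the
    transition; here a fixed headway rule stands in for that call.
    """
    return 1.0 if next_state["headway_m"] >= 10.0 else -1.0


def shaped_reward(env_reward, llm_score, weight=0.1):
    """Add a weighted LLM score to the environment reward (training only)."""
    return env_reward + weight * llm_score


if __name__ == "__main__":
    state = {"headway_m": 12.0}
    next_state = {"headway_m": 6.0}  # the ego vehicle closed in on its leader
    score = stub_llm_score(state, "accelerate", next_state)
    print(shaped_reward(env_reward=1.0, llm_score=score, weight=0.5))
```

Because the scorer is only consulted while training, the learned policy runs at test time without any model call, which matches the division of labor described in the abstract.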
Prompt-Driven Domain Adaptation for End-to-End Autonomous Driving via In-Context RL
Khurram, Aleesha, Moeini, Amir, Zhang, Shangtong, Chandra, Rohan
Despite significant progress and advances in autonomous driving, many end-to-end systems still struggle with domain adaptation (DA), such as transferring a policy trained under clear weather to adverse weather conditions. Typical DA strategies in the literature include collecting additional data in the target domain or re-training the model, or both. Both these strategies quickly become impractical as the scale and complexity of driving increase. These limitations have encouraged investigation into few-shot and zero-shot prompt-driven DA at inference time involving LLMs and VLMs. These methods work by adding a few state-action trajectories during inference to the prompt (similar to in-context learning). However, there are two limitations of such an approach: (i) prompt-driven DA methods are currently restricted to perception tasks such as detection and segmentation and (ii) they require expert few-shot data. In this work, we present a new approach to inference-time few-shot prompt-driven DA for closed-loop autonomous driving in adverse weather conditions using in-context reinforcement learning (ICRL). Similar to other prompt-driven DA methods, our approach requires neither updates to the model parameters nor additional data collection in the adverse weather regime. Furthermore, our approach advances the state-of-the-art in prompt-driven DA by extending to closed-loop driving using general trajectories observed during inference. Our experiments using the CARLA simulator show that ICRL results in safer, more efficient, and more comfortable driving policies in the target domain compared to state-of-the-art prompt-driven DA baselines.
- North America > United States > Virginia (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (0.83)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
GameChat: Multi-LLM Dialogue for Safe, Agile, and Socially Optimal Multi-Agent Navigation in Constrained Environments
Mahadevan, Vagul, Zhang, Shangtong, Chandra, Rohan
Safe, agile, and socially compliant multi-robot navigation in cluttered and constrained environments remains a critical challenge. This is especially difficult with self-interested agents in decentralized settings, where there is no central authority to resolve conflicts induced by spatial symmetry. We address this challenge by proposing a novel approach, GameChat, which facilitates safe, agile, and deadlock-free navigation for both cooperative and self-interested agents. Key to our approach is the use of natural language communication to resolve conflicts, enabling agents to prioritize more urgent tasks and break spatial symmetry in a socially optimal manner. Our algorithm ensures subgame perfect equilibrium, preventing agents from deviating from agreed-upon behaviors and supporting cooperation. Furthermore, we guarantee safety through control barrier functions and preserve agility by minimizing disruptions to agents' planned trajectories. We evaluate GameChat in simulated environments with doorways and intersections. The results show that even in the worst case, GameChat reduces the time for all agents to reach their goals by over 35% from a naive baseline and by over 20% from SMG-CBF in the intersection scenario, while doubling the rate of ensuring the agent with a higher priority task reaches the goal first, from 50% (equivalent to random chance) to a 100% perfect performance at maximizing social welfare.
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
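The GameChat abstract mentions guaranteeing safety through control barrier functions while minimally disrupting planned trajectories. A minimal sketch of that general idea follows, assuming single-integrator dynamics (velocity is the control), one circular obstacle, and the barrier h(x) = ||x - x_o||^2 - r^2. The function name and the `alpha` gain are assumptions; this is not the controller from the paper.

```python
def cbf_filter(x, u_nom, obstacle, radius, alpha=1.0):
    """Minimally modify u_nom so that grad_h(x) . u + alpha * h(x) >= 0.

    x, u_nom, obstacle are (x, y) tuples. With a single linear constraint,
    the smallest correction has a closed form: a projection onto the
    constraint boundary.
    """
    dx, dy = x[0] - obstacle[0], x[1] - obstacle[1]
    h = dx * dx + dy * dy - radius * radius  # barrier value, >= 0 when safe
    ax, ay = 2.0 * dx, 2.0 * dy              # gradient of h with respect to x

    slack = ax * u_nom[0] + ay * u_nom[1] + alpha * h
    if slack >= 0.0:
        return u_nom  # nominal command already satisfies the constraint

    norm_sq = ax * ax + ay * ay
    if norm_sq == 0.0:
        raise ValueError("robot is at the obstacle center; constraint is infeasible")
    scale = slack / norm_sq
    return (u_nom[0] - scale * ax, u_nom[1] - scale * ay)


if __name__ == "__main__":
    # Robot at (2, 0) commanded to drive fast toward an obstacle at the origin.
    print(cbf_filter(x=(2.0, 0.0), u_nom=(-10.0, 0.0), obstacle=(0.0, 0.0), radius=1.0))
```

The filter leaves safe commands untouched and otherwise applies the smallest change that restores the constraint, which is one way to read "preserving agility" in this setting.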
BehAV: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes
Weerakoon, Kasun, Elnoor, Mohamed, Seneviratne, Gershom, Rajagopal, Vignesh, Arul, Senthil Hariharan, Liang, Jing, Jaffar, Mohamed Khalid M, Manocha, Dinesh
We present BehAV, a novel approach for autonomous robot navigation in outdoor scenes guided by human instructions and leveraging Vision Language Models (VLMs). Our method interprets human commands using a Large Language Model (LLM) and categorizes the instructions into navigation and behavioral guidelines. Navigation guidelines consist of directional commands (e.g., "move forward until") and associated landmarks (e.g., "the building with blue windows"), while behavioral guidelines encompass regulatory actions (e.g., "stay on") and their corresponding objects (e.g., "pavements"). We use VLMs for their zero-shot scene understanding capabilities to estimate landmark locations from RGB images for robot navigation. Further, we introduce a novel scene representation that utilizes VLMs to ground behavioral rules into a behavioral cost map. This cost map encodes the presence of behavioral objects within the scene and assigns costs based on their regulatory actions. The behavioral cost map is integrated with a LiDAR-based occupancy map for navigation. To navigate outdoor scenes while adhering to the instructed behaviors, we present an unconstrained Model Predictive Control (MPC)-based planner that prioritizes both reaching landmarks and following behavioral guidelines. We evaluate the performance of BehAV on a quadruped robot across diverse real-world scenarios, demonstrating a 22.49% improvement in alignment with human-teleoperated actions, as measured by Frechet distance, and achieving a 40% higher navigation success rate compared to state-of-the-art methods.
- Energy > Oil & Gas (0.54)
- Transportation (0.46)
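The BehAV abstract reports alignment with human-teleoperated paths in terms of Fréchet distance. For readers unfamiliar with the metric, the sketch below computes the discrete Fréchet distance between two polylines using the standard dynamic program. Whether the paper uses the discrete or continuous variant is not stated in the abstract, so treat this as an illustration of the metric rather than the authors' evaluation code.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between two non-empty point sequences."""
    if not p or not q:
        raise ValueError("both paths must contain at least one point")
    n, m = len(p), len(q)
    # ca[i][j]: smallest achievable "leash length" for prefixes p[:i+1], q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


if __name__ == "__main__":
    robot_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    human_path = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(robot_path, human_path))
```

Unlike a pointwise average error, this metric respects the ordering of points along each path, so a route that visits the same places in a different order scores poorly.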
PANOS: Payload-Aware Navigation in Offroad Scenarios
Singh, Kartikeya, Turkar, Yash, Aluckal, Christo, Adhivarahan, Charuvarahan, Dantu, Karthik
Nature has evolved humans to walk on different terrains by developing a detailed understanding of their physical characteristics. Similarly, legged robots need to develop the capability to walk on complex terrains with a variety of task-dependent payloads to achieve their goals. However, conventional terrain adaptation methods are susceptible to failure with varying payloads. In this work, we introduce PANOS, a weakly supervised approach that integrates proprioception and exteroception from onboard sensing so that a legged robot maintains a stable gait while walking over various terrains. Our work also provides evidence of its adaptability to varying payloads. We evaluate our method on multiple terrains and payloads using a legged robot. PANOS improves stability by up to 44% without any payload and 53% with a 15 lb payload. We also observe a 20% reduction in vibration cost with the payload across various terrain types when compared to state-of-the-art methods.
GAMEOPT+: Improving Fuel Efficiency in Unregulated Heterogeneous Traffic Intersections via Optimal Multi-agent Cooperative Control
Suriyarachchi, Nilesh, Chandra, Rohan, Anantula, Arya, Baras, John S., Manocha, Dinesh
Better fuel efficiency leads to better financial security as well as a cleaner environment. We propose a novel approach for improving fuel efficiency in unstructured and unregulated traffic environments. Existing intelligent transportation solutions for improving fuel efficiency, however, apply only to traffic intersections with sparse traffic or traffic where drivers obey the regulations, or both. We propose GameOpt+, a novel hybrid approach for cooperative intersection control in dynamic, multi-lane, unsignalized intersections. GameOpt+ is a hybrid solution that combines an auction mechanism and an optimization-based trajectory planner. It generates a priority entrance sequence for each agent and computes velocity controls in real-time, taking less than 10 milliseconds even in high-density traffic with over 10,000 vehicles per hour. Compared to fully optimization-based methods, it operates 100 times faster while ensuring fairness, safety, and efficiency. Tested on the SUMO simulator, our algorithm improves throughput by at least 25%, reduces the time to reach the goal by at least 70%, and decreases fuel consumption by 50% compared to auction-based and signaled approaches using traffic lights and stop signs. GameOpt+ is also unaffected by unbalanced traffic inflows, whereas some of the other baselines encountered a decrease in performance in unbalanced traffic inflow environments.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Energy (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Trajectory Prediction for Robot Navigation using Flow-Guided Markov Neural Operator
Bhaskara, Rashmi, Viswanath, Hrishikesh, Bera, Aniket
Predicting pedestrian movements remains a complex and persistent challenge in robot navigation research. We must evaluate several factors to achieve accurate predictions, such as pedestrian interactions, the environment, crowd density, and social and cultural norms. Accurate prediction of pedestrian paths is vital for ensuring safe human-robot interaction, especially in robot navigation. Furthermore, this research has potential applications in autonomous vehicles, pedestrian tracking, and human-robot collaboration. Therefore, in this paper, we introduce FlowMNO, an Optical Flow-Integrated Markov Neural Operator designed to capture pedestrian behavior across diverse scenarios. Our paper models trajectory prediction as a Markovian process, where future pedestrian coordinates depend solely on the current state. This problem formulation eliminates the need to store previous states. We conducted experiments using standard benchmark datasets like ETH, HOTEL, ZARA1, ZARA2, UCY, and RGB-D pedestrian datasets. Our study demonstrates that FlowMNO outperforms some of the state-of-the-art deep learning methods like LSTM, GAN, and CNN-based approaches, by approximately 86.46% when predicting pedestrian trajectories. Thus, we show that FlowMNO can seamlessly integrate into robot navigation systems, enhancing their ability to navigate crowded areas smoothly.
- North America > United States (0.04)
- Europe > Greece (0.04)
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
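The FlowMNO abstract frames trajectory prediction as a Markovian process in which the next pedestrian state depends only on the current one, so no history needs to be stored. The sketch below shows what such a rollout looks like structurally. The constant-velocity `step` function is a stand-in assumption for the learned operator, and the state layout `(x, y, vx, vy)` and default `dt` are hypothetical.

```python
def step(state, dt=1.0):
    """Stand-in transition: constant-velocity motion (not the learned operator)."""
    x, y, vx, vy = state
    return (x + vx * dt, y + vy * dt, vx, vy)


def rollout(state, horizon, transition=step):
    """Predict `horizon` future positions using only the current state.

    Each iteration overwrites `state`, so memory use does not grow with the
    length of the observed history.
    """
    path = []
    for _ in range(horizon):
        state = transition(state)
        path.append((state[0], state[1]))
    return path


if __name__ == "__main__":
    # Pedestrian at the origin walking along +x at 1 unit per step.
    print(rollout((0.0, 0.0, 1.0, 0.0), horizon=3))
```

Swapping `step` for a learned model changes the quality of the prediction but not the structure of the loop, which is the point of the Markov formulation.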
DroNeRF: Real-time Multi-agent Drone Pose Optimization for Computing Neural Radiance Fields
Patel, Dipam, Pham, Phu, Bera, Aniket
We present a novel optimization algorithm called DroNeRF for the autonomous positioning of monocular camera drones around an object for real-time 3D reconstruction using only a few images. Neural Radiance Fields, or NeRF, is a novel view synthesis technique used to generate new views of an object or scene from a set of input images. Using drones in conjunction with NeRF provides a unique and dynamic way to generate novel views of a scene, especially with limited scene capabilities of restricted movements. Our approach focuses on calculating an optimized pose for each drone, depending solely on the object geometry and without using any external localization system. The camera positioning during the data-capturing phase significantly impacts the quality of the 3D model. To evaluate the quality of our generated novel views, we compute different perceptual metrics such as the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index Measure (SSIM). Our work demonstrates the benefit of using an optimal placement of various drones with limited mobility to generate perceptually better results.
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
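The DroNeRF abstract evaluates rendered views with PSNR, among other metrics. As a reminder of what that number means, PSNR is defined as 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the reference and rendered pixel values. The sketch below works on flat lists of pixel intensities; the function name and the 8-bit default `max_value` are assumptions for illustration.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists."""
    if not reference or len(reference) != len(rendered):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: noise power is zero
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [100, 100]
    novel_view = [110, 90]  # each pixel is off by 10 intensity levels
    print(round(psnr(ground_truth, novel_view), 2))
```

Higher values indicate a rendered view closer to the reference. PSNR is purely pixel-wise, which is why it is usually reported alongside a structural metric such as SSIM.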