Tokekar, Pratap
RE-MOVE: An Adaptive Policy Design for Robotic Navigation Tasks in Dynamic Environments via Language-Based Feedback
Chakraborty, Souradip, Weerakoon, Kasun, Poddar, Prithvi, Elnoor, Mohamed, Narayanan, Priya, Busart, Carl, Tokekar, Pratap, Bedi, Amrit Singh, Manocha, Dinesh
Reinforcement learning-based policies for continuous-control robotic navigation tasks often fail to adapt to changes in the environment during real-time deployment, which may result in catastrophic failures. To address this limitation, we propose a novel approach called RE-MOVE (REquest help and MOVE on) to adapt an already-trained policy to real-time changes in the environment, without re-training, by utilizing language-based feedback. The proposed approach essentially boils down to addressing two main challenges: (1) when to ask for feedback and, if received, (2) how to incorporate the feedback into the trained policy. RE-MOVE incorporates an epistemic uncertainty-based framework to determine the optimal time to request instruction-based feedback, and a natural language processing (NLP) paradigm with efficient prompt design to incorporate the feedback into the trained policy. In dynamic scenes, RE-MOVE identifies the uncertainties that appear in the observation space (i.e., a LiDAR laser-scan-based 2D cost map in our context) and requests assistance. Such assistance is essential in scenarios where the laser scan misleadingly detects pliable regions (i.e., perceptually deceptive yet navigable objects such as hanging clothes, curtains, and thin tall grass) as solid obstacles due to the sensing limitations of the LiDAR. To show the efficacy of the proposed approach, we performed extensive synthetic and real-world evaluations in several test-time dynamic navigation scenarios, observing up to an 80% enhancement in the attainment of successful goals, coupled with a 13.50% reduction in normalized trajectory length, compared to alternative approaches.
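The uncertainty-triggered feedback request can be sketched with a policy ensemble whose disagreement serves as the epistemic uncertainty signal; the ensemble representation, variance measure, and threshold below are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def epistemic_uncertainty(ensemble_actions):
    """Disagreement across an ensemble's action predictions:
    variance over ensemble members, averaged over action dimensions."""
    return float(np.mean(np.var(ensemble_actions, axis=0)))

def should_request_feedback(ensemble_actions, threshold=0.05):
    """Request language-based feedback only when the ensemble disagrees.
    The threshold is an illustrative tuning knob."""
    return epistemic_uncertainty(ensemble_actions) > threshold

# Toy example: five ensemble members, each predicting a 2-D velocity command.
agree = np.array([[0.5, 0.1]] * 5)  # all members agree -> low uncertainty
disagree = np.array([[0.5, 0.1], [-0.4, 0.3], [0.1, -0.2],
                     [0.6, 0.0], [-0.2, 0.4]])  # members conflict
```

When the trigger fires, the robot would pause and ask for help; otherwise it continues executing the trained policy.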
UIVNAV: Underwater Information-driven Vision-based Navigation via Imitation Learning
Lin, Xiaomin, Karapetyan, Nare, Joshi, Kaustubh, Liu, Tianchen, Chopra, Nikhil, Yu, Miao, Tokekar, Pratap, Aloimonos, Yiannis
Autonomous navigation in the underwater environment is challenging due to limited visibility, dynamic changes, and the lack of a cost-efficient, accurate localization system. We introduce UIVNav, a novel end-to-end underwater navigation solution designed to drive robots over Objects of Interest (OOI) while avoiding obstacles, without relying on localization. UIVNav uses imitation learning and is inspired by the navigation strategies of human divers, who do not rely on localization. UIVNav consists of two phases: (1) generating an intermediate representation (IR), and (2) training the navigation policy on human-labeled IR. Because the navigation policy is trained on the IR instead of raw data, the second phase is domain-invariant: the policy does not need to be retrained if the domain or the OOI changes. We demonstrate this by deploying the same navigation policy to survey two different OOIs, oyster and rock reefs, in two different domains, simulation and a real pool. Comparisons with complete coverage and random walk methods show that our method is more efficient at gathering information about OOIs while also avoiding obstacles. The results show that UIVNav chooses to visit areas with larger concentrations of oysters or rocks, with no prior information about the environment or localization. Moreover, a robot using UIVNav surveys on average 36% more oysters than the complete coverage method when traveling the same distance. We also demonstrate the feasibility of real-time deployment of UIVNav in pool experiments with a BlueROV underwater robot surveying a bed of oyster shells.
Efficiently Identifying Hotspots in a Spatially Varying Field with Multiple Robots
Suryan, Varun, Tokekar, Pratap
In this paper, we present algorithms to identify environmental hotspots using mobile sensors. We examine two approaches: one involving a single robot and another using multiple robots coordinated through a decentralized system. We introduce an adaptive algorithm that does not require precise knowledge of Gaussian Process (GP) hyperparameters, making the modeling process more flexible. The robots operate for a pre-defined time in the environment. The multi-robot system uses Voronoi partitioning to divide tasks and Monte Carlo Tree Search for path planning. Our tests on synthetic data and a real-world dataset of Chlorophyll density from a Pacific Ocean sub-region suggest that accurate estimation of GP hyperparameters may not be essential for hotspot detection, potentially simplifying environmental monitoring tasks.
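A minimal sketch of the multi-robot pipeline: Voronoi partitioning assigns candidate points to the nearest robot, and a GP posterior mean (RBF kernel with assumed hyperparameters) ranks the points in each cell. The greedy per-cell target selection below stands in for the paper's Monte Carlo Tree Search planner:

```python
import numpy as np

def voronoi_assign(points, robot_positions):
    """Assign each candidate point to its nearest robot (Voronoi partitioning)."""
    d = np.linalg.norm(points[:, None, :] - robot_positions[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def gp_mean(train_x, train_y, query_x, length_scale=1.0, noise=1e-3):
    """Posterior mean of a zero-mean GP with an RBF kernel (assumed hyperparameters)."""
    def k(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
        return np.exp(-0.5 * d2 / length_scale ** 2)
    K = k(train_x, train_x) + noise * np.eye(len(train_x))
    return k(query_x, train_x) @ np.linalg.solve(K, train_y)

def next_targets(points, robot_positions, train_x, train_y):
    """Each robot heads to the highest predicted-value point in its own cell."""
    owner = voronoi_assign(points, robot_positions)
    mu = gp_mean(train_x, train_y, points)
    targets = {}
    for r in range(len(robot_positions)):
        cell = np.where(owner == r)[0]
        if len(cell):
            targets[r] = int(cell[np.argmax(mu[cell])])
    return targets
```

Each robot would re-fit the GP with its new measurements and repeat until the time budget expires.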
Pred-NBV: Prediction-guided Next-Best-View for 3D Object Reconstruction
Dhami, Harnaik, Sharma, Vishnu D., Tokekar, Pratap
Prediction-based active perception has shown the potential to improve the navigation efficiency and safety of the robot by anticipating the uncertainty in the unknown environment. Existing works on 3D shape prediction make implicit assumptions about the partial observations, which prevents their use for real-world planning, and do not consider the control effort required for next-best-view planning. We present Pred-NBV, a realistic object shape reconstruction method consisting of PoinTr-C, an enhanced 3D prediction model trained on the ShapeNet dataset, and an information- and control-effort-based next-best-view method to address these issues. Pred-NBV shows an improvement of 25.46% in object coverage over traditional methods in the AirSim simulator, and performs better shape completion than PoinTr, the state-of-the-art shape completion model, even on real data obtained from a Velodyne 3D LiDAR mounted on a DJI M600 Pro.
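The joint information/control-effort trade-off can be illustrated with a simple scoring rule over candidate viewpoints; the linear combination, distance-based effort proxy, and weight `lam` are assumptions for illustration, not Pred-NBV's actual objective:

```python
import math

def control_effort(current, candidate):
    """Euclidean travel distance as a crude proxy for control effort."""
    return math.dist(current, candidate)

def select_next_best_view(current_pose, candidates, lam=0.5):
    """Pick the view maximizing predicted information gain minus weighted effort.

    `candidates` is a list of (position, predicted_info_gain) pairs, where the
    gain would come from the shape-completion prediction; `lam` trades off
    new coverage against travel cost (illustrative value).
    """
    best, best_score = None, -math.inf
    for pos, info in candidates:
        score = info - lam * control_effort(current_pose, pos)
        if score > best_score:
            best, best_score = pos, score
    return best, best_score
```

Note how a slightly less informative but much closer view can win, which is the point of including control effort in the objective.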
ProxMaP: Proximal Occupancy Map Prediction for Efficient Indoor Robot Navigation
Sharma, Vishnu Dutt, Chen, Jingxi, Tokekar, Pratap
Planning a path for a mobile robot typically requires building a map (e.g., an occupancy grid) of the environment as the robot moves around. While navigating in an unknown environment, the map built by the robot online may have many as-yet-unknown regions. A conservative planner may avoid such regions, taking a longer time to reach the goal. Instead, if a robot is able to correctly predict the occupancy in the occluded regions, it may navigate more efficiently. We present a self-supervised occupancy prediction technique, ProxMaP, to predict the occupancy within the proximity of the robot and thereby enable faster navigation. We show that ProxMaP generalizes well across realistic and real domains, and improves robot navigation efficiency in simulation by 12.40% over a traditional navigation method. We share our findings and code at https://raaslab.org/projects/ProxMaP.
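The self-supervised setup can be sketched by generating input/target pairs from a known map, with random occlusion standing in for the robot's occluded view, so no manual labels are needed; the grid size, occlusion model, and the `-1` unknown label are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_training_pair(full_map, visible_fraction=0.6):
    """Self-supervised pair: randomly hide cells of a known occupancy map.

    The occluded grid is the prediction network's input and the full map is
    its target; the labels come for free from the map itself.
    """
    UNKNOWN = -1
    mask = rng.random(full_map.shape) < visible_fraction
    occluded = np.where(mask, full_map, UNKNOWN)
    return occluded, full_map

# Toy 8x8 occupancy grid: 1 = occupied, 0 = free.
full = (rng.random((8, 8)) > 0.7).astype(int)
x, y = make_training_pair(full)
```

A predictor trained on many such pairs learns to fill in the unknown cells, which a planner can then treat as (tentatively) free or occupied.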
Data-Driven Distributionally Robust Optimal Control with State-Dependent Noise
Liu, Rui, Shi, Guangyao, Tokekar, Pratap
Distributionally Robust Optimal Control (DROC) is a technique that enables robust control in a stochastic setting when the true distribution is not known. Traditional DROC approaches require given ambiguity sets or a KL divergence bound to represent the distributional uncertainty. These may not be known a priori and may require hand-crafting. In this paper, we lift this assumption by introducing a data-driven technique for estimating the uncertainty and a bound for the KL divergence. We call this technique D3ROC. To evaluate the effectiveness of our approach, we consider a navigation problem for a car-like robot with unknown noise distributions. The results demonstrate that D3ROC provides robust and efficient control policies that outperform the iterative Linear Quadratic Gaussian (iLQG) control. The results also show the effectiveness of our proposed approach in handling different noise distributions.
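The data-driven estimation of a KL-divergence bound can be illustrated in 1-D by fitting sample moments and evaluating the closed-form Gaussian KL divergence against the nominal noise model; D3ROC's actual estimator handles state-dependent noise, so this is only a simplified sketch:

```python
import math
import statistics

def gaussian_kl(mu0, var0, mu1, var1):
    """Closed-form KL(N(mu0, var0) || N(mu1, var1)) for 1-D Gaussians."""
    return 0.5 * (var0 / var1 + (mu1 - mu0) ** 2 / var1
                  - 1.0 + math.log(var1 / var0))

def estimate_kl_bound(noise_samples, nominal_mu=0.0, nominal_var=1.0):
    """Fit sample mean/variance to the observed noise and measure how far the
    empirical model is from the nominal one; the result can serve as a
    data-driven ambiguity bound for the robust controller."""
    mu = statistics.fmean(noise_samples)
    var = statistics.variance(noise_samples)
    return gaussian_kl(mu, var, nominal_mu, nominal_var)
```

A matching distribution yields a bound of zero, and the bound grows as the observed noise deviates from the nominal model.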
Where to Drop Sensors from Aerial Robots to Monitor a Surface-Level Phenomenon?
Shek, Chak Lam, Shi, Guangyao, Asghar, Ahmad Bilal, Tokekar, Pratap
We consider the problem of routing a team of energy-constrained Unmanned Aerial Vehicles (UAVs) to drop immovable sensors for monitoring a task area in the presence of stochastic wind disturbances. In prior work on mobile sensor routing problems, sensors and their carrier are one integrated platform, and sensors are assumed to be able to take measurements at exactly the desired locations. By contrast, airdropping the sensors onto the ground introduces stochasticity in their landing locations. We focus on addressing this stochasticity in sensor locations from the path-planning perspective. Specifically, we formulate the problem (Multi-UAV Sensor Drop) as a variant of the Submodular Team Orienteering Problem with one additional constraint on the number of sensors on each UAV. The objective is to maximize the Mutual Information between the phenomenon at Points of Interest (PoIs) and the measurements that sensors will take at stochastic locations. We show that such an objective is computationally expensive to evaluate. To tackle this challenge, we propose a surrogate objective with a closed-form expression based on the expected mean and expected covariance of the Gaussian Process. We propose a heuristic algorithm to solve the optimization problem with the surrogate objective. The formulation and the algorithms are validated through extensive simulations.
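The mutual information between jointly Gaussian variables has a closed form, which the following sketch evaluates at fixed sensor locations; plugging in expected landing locations instead of the stochastic ones mirrors the spirit of the surrogate objective, though the kernel, length scale, and noise level are illustrative assumptions:

```python
import numpy as np

def rbf(a, b, length_scale=2.0):
    """RBF kernel between two sets of 2-D locations."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def mutual_information(poi, sensor_locs, noise=1e-2):
    """MI between the GP field at PoIs and measurements at sensor locations:
    I = 0.5 * (logdet(prior cov of PoIs) - logdet(posterior cov given sensors))."""
    K_pp = rbf(poi, poi) + noise * np.eye(len(poi))
    K_ss = rbf(sensor_locs, sensor_locs) + noise * np.eye(len(sensor_locs))
    K_ps = rbf(poi, sensor_locs)
    posterior = K_pp - K_ps @ np.linalg.solve(K_ss, K_ps.T)
    _, ld_prior = np.linalg.slogdet(K_pp)
    _, ld_post = np.linalg.slogdet(posterior)
    return 0.5 * (ld_prior - ld_post)

poi = np.array([[0.0, 0.0], [1.0, 0.0]])
near = np.array([[0.1, 0.0], [0.9, 0.0]])   # sensors land near the PoIs
far = np.array([[10.0, 10.0], [12.0, 10.0]])  # sensors land far away
```

Sensors expected to land near the PoIs yield much higher mutual information, which is the quantity the UAV routes are chosen to maximize.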
MAP-NBV: Multi-agent Prediction-guided Next-Best-View Planning for Active 3D Object Reconstruction
Dhami, Harnaik, Sharma, Vishnu D., Tokekar, Pratap
We propose MAP-NBV, a prediction-guided active algorithm for 3D reconstruction with multi-agent systems. Prediction-based approaches have shown great improvement in active perception tasks by learning the cues about structures in the environment from data. But these methods primarily focus on single-agent systems. We design a next-best-view approach that utilizes geometric measures over the predictions and jointly optimizes the information gain and control effort for efficient collaborative 3D reconstruction of the object. Our method achieves 22.75% improvement over the prediction-based single-agent approach and 15.63% improvement over the non-predictive multi-agent approach. We make our code publicly available through our project website: http://raaslab.org/projects/MAPNBV/
GATSBI: An Online GTSP-Based Algorithm for Targeted Surface Bridge Inspection
Dhami, Harnaik, Yu, Kevin, Williams, Troi, Vajipey, Vineeth, Tokekar, Pratap
We study the problem of visual surface inspection of a bridge for defects using an Unmanned Aerial Vehicle (UAV). We do not assume that the geometric model of the bridge is known beforehand. Our planner, termed GATSBI, plans a path in a receding horizon fashion to inspect all points on the surface of the bridge. The input to GATSBI consists of a 3D occupancy map created online with LiDAR scans. Occupied voxels corresponding to the bridge in this map are semantically segmented and used to create a bridge-only occupancy map. Inspecting a bridge voxel requires the UAV to take images from a desired viewing angle and distance. We then create a Generalized Traveling Salesperson Problem (GTSP) instance to cluster candidate viewpoints for inspecting the bridge voxels and use an off-the-shelf GTSP solver to find the optimal path for the given instance. As the algorithm sees more parts of the environment over time, it replans the path to inspect novel parts of the bridge while avoiding obstacles. We evaluate the performance of our algorithm through high-fidelity simulations conducted in AirSim and real-world experiments. We compare the performance of GATSBI with a classical exploration algorithm. Our evaluation reveals that targeting the inspection to only the segmented bridge voxels and planning carefully using a GTSP solver leads to a more efficient and thorough inspection than the baseline algorithm.
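Choosing one viewpoint per bridge voxel and ordering the visits is exactly the GTSP structure described above; the brute-force enumeration below stands in for the off-the-shelf GTSP solver and is only tractable for tiny instances:

```python
import itertools
import math

def tour_length(start, order):
    """Total path length from the start pose through the chosen viewpoints."""
    pts = [start] + list(order)
    return sum(math.dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1))

def solve_gtsp_brute_force(start, clusters):
    """Pick exactly one viewpoint per cluster (one cluster per bridge voxel)
    and order the visits to minimize path length."""
    best_tour, best_len = None, math.inf
    for choice in itertools.product(*clusters):       # one viewpoint per voxel
        for order in itertools.permutations(choice):  # visiting order
            length = tour_length(start, order)
            if length < best_len:
                best_tour, best_len = order, length
    return best_tour, best_len
```

In the receding-horizon setting, the clusters are rebuilt and the instance re-solved whenever new parts of the bridge are segmented out of the occupancy map.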
Multi-Agent Deep Reinforcement Learning For Persistent Monitoring With Sensing, Communication, and Localization Constraints
Mishra, Manav, Poddar, Prithvi, Agarwal, Rajat, Chen, Jingxi, Tokekar, Pratap, Sujit, P. B.
Determining multi-robot motion policies for persistently monitoring a region with limited sensing, communication, and localization constraints in non-GPS environments is a challenging problem. To take the localization constraints into account, in this paper we consider a heterogeneous robotic system consisting of two types of agents: anchor agents with accurate localization capability and auxiliary agents with low localization accuracy. To localize themselves, the auxiliary agents must be within the communication range of an anchor, directly or indirectly. The robotic team's objective is to minimize environmental uncertainty through persistent monitoring. We propose a multi-agent deep reinforcement learning (MARL) architecture with graph convolution called Graph Localized Proximal Policy Optimization (GALOPP), which incorporates the limited sensor field-of-view, communication, and localization constraints of the agents along with the persistent monitoring objective to determine motion policies for each agent. We evaluate the performance of GALOPP on open maps with obstacles, varying the numbers of anchor and auxiliary agents. We further study (i) the effect of communication range, obstacle density, and sensing range on performance, and (ii) compare GALOPP with non-RL baselines, namely greedy search, random search, and random search with a communication constraint. To assess its generalization capability, we also evaluate GALOPP in two additional environments -- a 2-room and a 4-room environment. The results show that GALOPP learns the policies and monitors the area well. As a proof of concept, we perform hardware experiments to demonstrate the performance of GALOPP.
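The localization constraint (an auxiliary agent is localized iff it reaches an anchor through the communication graph, directly or indirectly) can be checked with a breadth-first search over a disk communication graph; the disk model and coordinates below are illustrative assumptions:

```python
import math
from collections import deque

def localized_agents(anchors, auxiliaries, comm_range):
    """Return the indices of auxiliary agents that can localize themselves,
    i.e., that are connected to some anchor through the communication graph."""
    agents = list(anchors) + list(auxiliaries)
    n_anchor = len(anchors)
    # Build the disk communication graph: an edge iff agents are within range.
    adj = {i: [] for i in range(len(agents))}
    for i in range(len(agents)):
        for j in range(i + 1, len(agents)):
            if math.dist(agents[i], agents[j]) <= comm_range:
                adj[i].append(j)
                adj[j].append(i)
    # BFS outward from all anchors at once.
    seen = set(range(n_anchor))
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return {i - n_anchor for i in seen if i >= n_anchor}
```

A monitoring policy must keep this connectivity intact while spreading the agents out, which is the tension GALOPP's learned policies resolve.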