Goto

Collaborating Authors

 Conover, Damon


HessianForge: Scalable LiDAR reconstruction with Physics-Informed Neural Representation and Smoothness Energy Constraints

arXiv.org Artificial Intelligence

Accurate and efficient 3D mapping of large-scale outdoor environments from LiDAR measurements is a fundamental challenge in robotics, particularly towards ensuring smooth and artifact-free surface reconstructions. Although the state-of-the-art methods focus on memory-efficient neural representations for high-fidelity surface generation, they often fail to produce artifact-free manifolds, with artifacts arising due to noisy and sparse inputs. To address this issue, we frame surface mapping as a physics-informed energy optimization problem, enforcing surface smoothness by optimizing an energy functional that penalizes sharp surface ridges. Specifically, we propose a deep learning based approach that learns the signed distance field (SDF) of the surface manifold from raw LiDAR point clouds using a physics-informed loss function that optimizes the $L_2$-Hessian energy of the surface. Our learning framework includes a hierarchical octree based input feature encoding and a multi-scale neural network to iteratively refine the signed distance field at different scales of resolution. Lastly, we introduce a test-time refinement strategy to correct topological inconsistencies and edge distortions that can arise in the generated mesh. We propose a \texttt{CUDA}-accelerated least-squares optimization that locally adjusts vertex positions to enforce feature-preserving smoothing. We evaluate our approach on large-scale outdoor datasets and demonstrate that our approach outperforms current state-of-the-art methods in terms of improved accuracy and smoothness. Our code is available at \href{https://github.com/HrishikeshVish/HessianForge/}{https://github.com/HrishikeshVish/HessianForge/}


Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Autonomous robots are being employed in several mapping and data collection tasks due to their efficiency and low labor costs. In these tasks, the robots are required to map targets-of-interest in an unknown environment while constrained to a given resource budget such as path length or mission time. This is a challenging problem as each robot has to not only detect and avoid collisions from static obstacles in the environment but also has to model other robots' trajectories to avoid inter-robot collisions. We propose a novel deep reinforcement learning approach for multi-robot informative path planning to map targets-of-interest in an unknown 3D environment. A key aspect of our approach is an augmented graph that models other robots' trajectories to enable planning for communication and inter-robot collision avoidance. We train our decentralized reinforcement learning policy via the centralized training and decentralized execution paradigm. Once trained, our policy is also scalable to varying number of robots and does not require re-training. Our approach outperforms other state-of-the-art multi-robot target mapping approaches by 33.75% in terms of the number of discovered targets-of-interest. We open-source our code and model at: https://github.com/AccGen99/marl_ipp


COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models

arXiv.org Artificial Intelligence

We propose a novel framework COLLAGE for generating collaborative agent-object-agent interactions by leveraging large language models (LLMs) and hierarchical motion-specific vector-quantized variational autoencoders (VQ-VAEs). Our model addresses the lack of rich datasets in this domain by incorporating the knowledge and reasoning abilities of LLMs to guide a generative diffusion model. The hierarchical VQ-VAE architecture captures different motion-specific characteristics at multiple levels of abstraction, avoiding redundant concepts and enabling efficient multi-resolution representation. We introduce a diffusion model that operates in the latent space and incorporates LLM-generated motion planning cues to guide the denoising process, resulting in prompt-specific motion generation with greater control and diversity. Experimental results on the CORE-4D, and InterHuman datasets demonstrate the effectiveness of our approach in generating realistic and diverse collaborative human-object-human interactions, outperforming state-of-the-art methods. Our work opens up new possibilities for modeling complex interactions in various domains, such as robotics, graphics and computer vision.


Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion

arXiv.org Artificial Intelligence

By framing reinforcement learning as a sequence modeling problem, recent work has enabled the use of generative models, such as diffusion models, for planning. While these models are effective in predicting long-horizon state trajectories in deterministic environments, they face challenges in dynamic settings with moving obstacles. Effective collision avoidance demands continuous monitoring and adaptive decision-making. While replanning at every timestep could ensure safety, it introduces substantial computational overhead due to the repetitive prediction of overlapping state sequences -- a process that is particularly costly with diffusion models, known for their intensive iterative sampling procedure. We propose an adaptive generative planning approach that dynamically adjusts replanning frequency based on the uncertainty of action predictions. Our method minimizes the need for frequent, computationally expensive, and redundant replanning while maintaining robust collision avoidance performance. In experiments, we obtain a 13.5% increase in the mean trajectory length and a 12.7% increase in mean reward over long-horizon planning, indicating a reduction in collision rates and an improved ability to navigate the environment safely.


Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM

arXiv.org Artificial Intelligence

We introduce Go-SLAM, a novel framework that utilizes 3D Gaussian Splatting SLAM to reconstruct dynamic environments while embedding object-level information within the scene representations. This framework employs advanced object segmentation techniques, assigning a unique identifier to each Gaussian splat that corresponds to the object it represents. Consequently, our system facilitates open-vocabulary querying, allowing users to locate objects using natural language descriptions. Furthermore, the framework features an optimal path generation module that calculates efficient navigation paths for robots toward queried objects, considering obstacles and environmental uncertainties. Comprehensive evaluations in various scene settings demonstrate the effectiveness of our approach in delivering high-fidelity scene reconstructions, precise object segmentation, flexible object querying, and efficient robot path planning. This work represents an additional step forward in bridging the gap between 3D scene reconstruction, semantic object understanding, and real-time environment interactions.


VERN: Vegetation-aware Robot Navigation in Dense Unstructured Outdoor Environments

arXiv.org Artificial Intelligence

We propose a novel method for autonomous legged robot navigation in densely vegetated environments with a variety of pliable/traversable and non-pliable/untraversable vegetation. We present a novel few-shot learning classifier that can be trained on a few hundred RGB images to differentiate flora that can be navigated through, from the ones that must be circumvented. Using the vegetation classification and 2D lidar scans, our method constructs a vegetation-aware traversability cost map that accurately represents the pliable and non-pliable obstacles with lower, and higher traversability costs, respectively. Our cost map construction accounts for misclassifications of the vegetation and further lowers the risk of collisions, freezing and entrapment in vegetation during navigation. Furthermore, we propose holonomic recovery behaviors for the robot for scenarios where it freezes, or gets physically entrapped in dense, pliable vegetation. We demonstrate our method on a Boston Dynamics Spot robot in real-world unstructured environments with sparse and dense tall grass, bushes, trees, etc. We observe an increase of 25-90% in success rates, 10-90% decrease in freezing rate, and up to 65% decrease in the false positive rate compared to existing methods.