Lawson, Daniel
Differentiable Composite Neural Signed Distance Fields for Robot Navigation in Dynamic Indoor Environments
Bukhari, S. Talha, Lawson, Daniel, Qureshi, Ahmed H.
Neural Signed Distance Fields (SDFs) provide a differentiable environment representation from which collision checks and well-defined gradients for robot navigation tasks can be readily obtained. However, updating a neural SDF as the scene evolves entails re-training, which is tedious, time-consuming, and inefficient, making such representations unsuitable for robot navigation with a limited field of view in dynamic environments. To address this, we propose a compositional framework of neural SDFs for robot navigation in indoor environments using only an onboard RGB-D sensor. Our framework employs a dual-mode procedure for trajectory optimization, with the two modes using complementary methods to model collision costs and collision-avoidance gradients. The primary mode queries the SDF of the robot body, swept along the route to the goal, at the obstacle point cloud, enabling swift local optimization of trajectories. The secondary mode infers the visible scene's SDF by aligning and composing the SDF representations of its constituent objects, providing better-informed costs and gradients for trajectory optimization. The dual-mode procedure combines the best of both modes, achieving a 98% success rate, 14.4% higher than the baseline, with comparable amortized planning time on iGibson 2.0. We also demonstrate its effectiveness in adapting to real-world indoor scenarios.
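As a rough illustration of the primary mode, the Python sketch below (not the paper's implementation) queries a placeholder neural SDF of the robot body at obstacle points expressed relative to each trajectory waypoint and back-propagates a clearance penalty into the waypoints; the network architecture, 2D translation-only setting, and safety margin are assumptions made for brevity.

# Minimal sketch: differentiable collision cost from a (placeholder) robot-body SDF
# queried at an obstacle point cloud, optimized with autograd gradients.
import torch

body_sdf = torch.nn.Sequential(            # placeholder neural SDF of the robot body
    torch.nn.Linear(2, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)

def collision_cost(waypoints, obstacle_pts, margin=0.10):
    """waypoints: (T, 2) robot positions; obstacle_pts: (N, 2) depth point cloud."""
    # Express obstacles relative to each waypoint (translation only, for brevity).
    rel = obstacle_pts[None, :, :] - waypoints[:, None, :]        # (T, N, 2)
    dist = body_sdf(rel.reshape(-1, 2)).reshape(rel.shape[:2])    # signed distances
    # Penalize points that come within `margin` of the robot body.
    return torch.relu(margin - dist).sum()

waypoints = torch.randn(20, 2, requires_grad=True)     # trajectory to optimize
obstacles = torch.randn(500, 2)                        # points from one RGB-D frame
opt = torch.optim.Adam([waypoints], lr=1e-2)
for _ in range(50):                                    # swift local optimization
    opt.zero_grad()
    collision_cost(waypoints, obstacles).backward()
    opt.step()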
EarthquakeNPP: Benchmark Datasets for Earthquake Forecasting with Neural Point Processes
Stockman, Samuel, Lawson, Daniel, Werner, Maximilian
Classical point process models, such as the epidemic-type aftershock sequence (ETAS) model, have been used for decades to forecast the times and locations of earthquakes. Recent advances have led to Neural Point Processes (NPPs), which promise greater flexibility and improvements over classical models. However, the benchmark dataset currently used for NPPs does not represent an up-to-date challenge for the seismological community, since it lacks a key earthquake sequence from the region and improperly splits training and testing data. Furthermore, initial benchmarking of NPPs for earthquake forecasting lacks a comparison with the state-of-the-art forecasting models typically used by the seismological community. To address these gaps, we introduce EarthquakeNPP: a collection of benchmark datasets for testing NPPs on earthquake data, accompanied by a credible implementation of the ETAS model. The datasets cover a range of small to large target regions within California, spanning 1971 to 2021, and include different methodologies for dataset generation. In a benchmarking experiment, we compare three spatio-temporal NPPs against ETAS and find that none outperforms ETAS in either spatial or temporal log-likelihood. These results indicate that current NPP implementations are not yet suitable for practical earthquake forecasting. However, EarthquakeNPP will serve as a platform for collaboration between the seismology and machine learning communities, with the goal of improving earthquake predictability.
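For context on the classical baseline, the sketch below writes out the standard temporal ETAS conditional intensity and the corresponding log-likelihood against which models are compared; the parameter values and this plain-Python implementation are illustrative, not the paper's calibrated ETAS code.

# Minimal sketch of the temporal ETAS conditional intensity and log-likelihood.
import numpy as np

def etas_intensity(t, times, mags, mu, k0, alpha, c, p, m0):
    """lambda(t | history) = mu + sum_i k0 * exp(alpha*(m_i - m0)) * (t - t_i + c)^(-p)."""
    past = times < t
    return mu + np.sum(k0 * np.exp(alpha * (mags[past] - m0))
                       * (t - times[past] + c) ** (-p))

def temporal_log_likelihood(times, mags, T, mu, k0, alpha, c, p, m0):
    """sum_j log lambda(t_j)  -  integral_0^T lambda(t) dt   (assumes p != 1)."""
    log_term = sum(np.log(etas_intensity(t, times, mags, mu, k0, alpha, c, p, m0))
                   for t in times)
    kernel_int = ((T - times + c) ** (1 - p) - c ** (1 - p)) / (1 - p)
    integral = mu * T + np.sum(k0 * np.exp(alpha * (mags - m0)) * kernel_int)
    return log_term - integral

# Illustrative usage with a tiny synthetic catalogue (values are not fitted).
times = np.array([0.5, 1.2, 1.3, 4.0]); mags = np.array([4.1, 3.2, 3.0, 3.5])
ll = temporal_log_likelihood(times, mags, T=10.0, mu=0.1, k0=0.02,
                             alpha=1.8, c=0.01, p=1.1, m0=3.0)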
Co-learning Planning and Control Policies Constrained by Differentiable Logic Specifications
Xiong, Zikang, Lawson, Daniel, Eappen, Joe, Qureshi, Ahmed H., Jagannathan, Suresh
Synthesizing planning and control policies in robotics is a fundamental task, further complicated by factors such as complex logic specifications and high-dimensional robot dynamics. This paper presents a novel reinforcement learning approach to solving high-dimensional robot navigation tasks with complex logic specifications by co-learning planning and control policies. Notably, this approach significantly reduces the sample complexity of training, allowing us to train high-quality policies with far fewer samples than existing reinforcement learning algorithms. In addition, our methodology streamlines the extraction of complex specifications from map images and enables the efficient generation of long-horizon robot motion paths across different map layouts. Our approach also demonstrates capabilities for high-dimensional control and for avoiding suboptimal policies via policy alignment. The efficacy of our approach is demonstrated through experiments involving simulated high-dimensional quadruped robot dynamics and a real-world differential-drive robot (TurtleBot3) under different types of task specifications.
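One common way to make a logic specification differentiable, sketched below under our own assumptions rather than the paper's exact formulation, is to replace the min/max of robustness semantics with smooth log-sum-exp surrogates so the specification score can back-propagate into a policy; the "eventually reach the goal and always avoid the obstacle" specification and the region radii are illustrative.

# Minimal sketch: smooth robustness of a reach-avoid specification.
import torch

def soft_max(x, beta=10.0):   # smooth max (approaches max as beta grows)
    return torch.logsumexp(beta * x, dim=-1) / beta

def soft_min(x, beta=10.0):   # smooth min
    return -soft_max(-x, beta)

def robustness(traj, goal, obstacle, goal_r=0.5, obs_r=0.5):
    """traj: (T, 2) states. Returns a differentiable score, positive when the
    specification is satisfied (up to smoothing error)."""
    reach = goal_r - torch.norm(traj - goal, dim=-1)      # > 0 inside goal region
    avoid = torch.norm(traj - obstacle, dim=-1) - obs_r   # > 0 outside obstacle
    eventually_reach = soft_max(reach)                    # "eventually" ~ max over time
    always_avoid = soft_min(avoid)                        # "always"     ~ min over time
    return soft_min(torch.stack([eventually_reach, always_avoid]))  # conjunction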
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies
Lawson, Daniel, Qureshi, Ahmed H.
Recent work has shown the promise of creating generalist, transformer-based models for language, vision, and sequential decision-making problems. Creating such models generally requires centralized training objectives, data, and compute. It is of interest whether we can create generalist policies more flexibly by merging multiple task-specific, individually trained policies. In this work, we take a preliminary step in this direction by merging, or averaging, in parameter space subsets of Decision Transformers trained on different MuJoCo locomotion problems, forming multi-task models without centralized training. We also demonstrate the importance of various methodological choices when merging policies, such as using common pre-trained initializations, increasing model capacity, and using Fisher information to weight parameter importance. We believe research in this direction could help democratize and distribute the process of forming multi-task robotics policies. Our implementation is available at https://github.com/daniellawson9999/merging-decision-transformers.
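The merging idea can be illustrated with a short sketch of plain and Fisher-weighted parameter averaging for models that share an architecture (and ideally a common pre-trained initialization); the diagonal-Fisher estimate via squared gradients and the helper interfaces below are assumptions, not the released implementation.

# Minimal sketch: averaging and Fisher-weighted merging of model parameters.
import torch

def average_merge(state_a, state_b):
    """Uniform average of two state_dicts with identical keys and shapes."""
    return {k: 0.5 * (state_a[k] + state_b[k]) for k in state_a}

def fisher_merge(states, fishers, eps=1e-8):
    """states, fishers: lists of state_dicts; weight each parameter by its Fisher."""
    merged = {}
    for k in states[0]:
        num = sum(f[k] * s[k] for s, f in zip(states, fishers))
        den = sum(f[k] for f in fishers) + eps
        merged[k] = num / den
    return merged

def diagonal_fisher(model, loss_fn, batches):
    """Approximate diag(F) for an nn.Module by mean squared gradients over batches."""
    fisher = {k: torch.zeros_like(p) for k, p in model.named_parameters()}
    for batch in batches:
        model.zero_grad()
        loss_fn(model, batch).backward()
        for k, p in model.named_parameters():
            fisher[k] += p.grad.detach() ** 2 / len(batches)
    return fisher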
Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling
Lawson, Daniel, Qureshi, Ahmed H.
Learning long-horizon tasks such as navigation has posed difficult challenges for successfully applying reinforcement learning to robotics. In contrast, in known environments, sampling-based planning can robustly find collision-free paths without learning. In this work, we propose Control Transformer, which models return-conditioned sequences from low-level policies guided by a sampling-based Probabilistic Roadmap (PRM) planner. We demonstrate that our framework can solve long-horizon navigation tasks using only local information. We evaluate our approach on partially observed maze navigation with MuJoCo robots, including Ant, Point, and Humanoid. We show that Control Transformer can successfully navigate through mazes and transfer to unknown environments. Additionally, we apply our method to a differential-drive robot (TurtleBot3) and show zero-shot sim2real transfer under noisy observations.
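A minimal sketch of return-conditioned rollout, under an assumed gym-style environment and a hypothetical model.act interface rather than the paper's code, shows how a Decision-Transformer-style policy is conditioned on a target return-to-go that is decremented by observed rewards.

# Minimal sketch: return-to-go targets and return-conditioned rollout.
import numpy as np

def returns_to_go(rewards):
    """Training-time target: R_t = sum of rewards from step t to the end."""
    return np.cumsum(rewards[::-1])[::-1]

def rollout(env, model, target_return, horizon=200, context=20):
    """env: gym-style environment (assumed); model.act: hypothetical interface."""
    states, actions, rtgs = [env.reset()], [], [target_return]
    for _ in range(horizon):
        # Condition on the most recent `context` (return-to-go, state, action) tokens.
        action = model.act(rtgs[-context:], states[-context:], actions[-context:])
        state, reward, done, _ = env.step(action)
        actions.append(action)
        states.append(state)
        rtgs.append(rtgs[-1] - reward)   # decrement the return-to-go
        if done:
            break
    return states, actions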