Goto

Collaborating Authors

 Goehring, Daniel


Beyond In-Distribution Performance: A Cross-Dataset Study of Trajectory Prediction Robustness

arXiv.org Machine Learning

The robustness of trajectory prediction is essential for practical applications in autonomous driving. The advancement of trajectory prediction models is catalyzed through public motion datasets and associated competitions, such as Argoverse 2 (A2) [1], and Waymo Open Motion (WO) [2]. These competitions establish standardized metrics and test protocols and score predictions on test data that is withheld from all competitors and hosted on protected evaluation servers only. This is intended to objectively compare the generalization ability of models to unseen data. However, these withheld test examples still share similarities with the training samples, such as sensor setup, map representation, post-processing, geographic, and scenario selection biases employed during dataset creation. Consequently, the test scores reported in each competition are examples of In-Distribution (ID) testing. To effectively evaluate model generalization, it is essential to test models on truly Out-of-Distribution (OoD) test samples, such as those from different motion datasets. We investigate model generalization across two large-scale motion datasets [3]: Argoverse 2 (A2) and Waymo Open Motion (WO). The WO dataset, with 576k scenarios, is more than twice the size of A2, which contains 250k scenarios.


Multi-modal NeRF Self-Supervision for LiDAR Semantic Segmentation

arXiv.org Artificial Intelligence

LiDAR Semantic Segmentation is a fundamental task in autonomous driving perception consisting of associating each LiDAR point to a semantic label. Fully-supervised models have widely tackled this task, but they require labels for each scan, which either limits their domain or requires impractical amounts of expensive annotations. Camera images, which are generally recorded alongside LiDAR pointclouds, can be processed by the widely available 2D foundation models, which are generic and dataset-agnostic. However, distilling knowledge from 2D data to improve LiDAR perception raises domain adaptation challenges. For example, the classical perspective projection suffers from the parallax effect produced by the position shift between both sensors at their respective capture times. We propose a Semi-Supervised Learning setup to leverage unlabeled LiDAR pointclouds alongside distilled knowledge from the camera images. To self-supervise our model on the unlabeled scans, we add an auxiliary NeRF head and cast rays from the camera viewpoint over the unlabeled voxel features. The NeRF head predicts densities and semantic logits at each sampled ray location which are used for rendering pixel semantics. Concurrently, we query the Segment-Anything (SAM) foundation model with the camera image to generate a set of unlabeled generic masks. We fuse the masks with the rendered pixel semantics from LiDAR to produce pseudo-labels that supervise the pixel predictions. During inference, we drop the NeRF head and run our model with only LiDAR. We show the effectiveness of our approach in three public LiDAR Semantic Segmentation benchmarks: nuScenes, SemanticKITTI and ScribbleKITTI.


Improving Out-of-Distribution Generalization of Trajectory Prediction for Autonomous Driving via Polynomial Representations

arXiv.org Artificial Intelligence

Robustness against Out-of-Distribution (OoD) samples is a key performance indicator of a trajectory prediction model. However, the development and ranking of state-of-the-art (SotA) models are driven by their In-Distribution (ID) performance on individual competition datasets. We present an OoD testing protocol that homogenizes datasets and prediction tasks across two large-scale motion datasets. We introduce a novel prediction algorithm based on polynomial representations for agent trajectory and road geometry on both the input and output sides of the model. With a much smaller model size, training effort, and inference time, we reach near SotA performance for ID testing and significantly improve robustness in OoD testing. Within our OoD testing protocol, we further study two augmentation strategies of SotA models and their effects on model generalization. Highlighting the contrast between ID and OoD performance, we suggest adding OoD testing to the evaluation criteria of trajectory prediction models.


Motion Planning under Uncertainty: Integrating Learning-Based Multi-Modal Predictors into Branch Model Predictive Control

arXiv.org Artificial Intelligence

In complex traffic environments, autonomous vehicles face multi-modal uncertainty about other agents' future behavior. To address this, recent advancements in learningbased motion predictors output multi-modal predictions. We present our novel framework that leverages Branch Model Predictive Control(BMPC) to account for these predictions. The framework includes an online scenario-selection process guided by topology and collision risk criteria. This efficiently selects a minimal set of predictions, rendering the BMPC realtime capable. Additionally, we introduce an adaptive decision postponing strategy that delays the planner's commitment to a single scenario until the uncertainty is resolved. Our comprehensive evaluations in traffic intersection and random highway merging scenarios demonstrate enhanced comfort and safety through our method.


Learning-Aided Warmstart of Model Predictive Control in Uncertain Fast-Changing Traffic

arXiv.org Artificial Intelligence

Model Predictive Control lacks the ability to escape local minima in nonconvex problems. Furthermore, in fast-changing, uncertain environments, the conventional warmstart, using the optimal trajectory from the last timestep, often falls short of providing an adequately close initial guess for the current optimal trajectory. This can potentially result in convergence failures and safety issues. Therefore, this paper proposes a framework for learning-aided warmstarts of Model Predictive Control algorithms. Our method leverages a neural network based multimodal predictor to generate multiple trajectory proposals for the autonomous vehicle, which are further refined by a sampling-based technique. This combined approach enables us to identify multiple distinct local minima and provide an improved initial guess. We validate our approach with Monte Carlo simulations of traffic scenarios.


Industrial Application of 6D Pose Estimation for Robotic Manipulation in Automotive Internal Logistics

arXiv.org Artificial Intelligence

Despite the advances in robotics a large proportion of the of parts handling tasks in the automotive industry's internal logistics are not automated but still performed by humans. A key component to competitively automate these processes is a 6D pose estimation that can handle a large number of different parts, is adaptable to new parts with little manual effort, and is sufficiently accurate and robust with respect to industry requirements. In this context, the question arises as to the current status quo with respect to these measures. To address this we built a representative 6D pose estimation pipeline with state-of-the-art components from economically scalable real to synthetic data generation to pose estimators and evaluated it on automotive parts with regards to a realistic sequencing process. We found that using the data generation approaches, the performance of the trained 6D pose estimators are promising, but do not meet industry requirements. We reveal that the reason for this is the inability of the estimators to provide reliable uncertainties for their poses, rather than the ability of to provide sufficiently accurate poses. In this context we further analyzed how RGB- and RGB-D-based approaches compare against this background and show that they are differently vulnerable to the domain gap induced by synthetic data.


An Empirical Bayes Analysis of Object Trajectory Representation Models

arXiv.org Artificial Intelligence

Linear trajectory models provide mathematical advantages to autonomous driving applications such as motion prediction. However, linear models' expressive power and bias for real-world trajectories have not been thoroughly analyzed. We present an in-depth empirical analysis of the trade-off between model complexity and fit error in modelling object trajectories. We analyze vehicle, cyclist, and pedestrian trajectories. Our methodology estimates observation noise and prior distributions over model parameters from several large-scale datasets. Incorporating these priors can then regularize prediction models. Our results show that linear models do represent real-world trajectories with high fidelity at very moderate model complexity. This suggests the feasibility of using linear trajectory models in future motion prediction systems with inherent mathematical advantages.