hptr
Supplementary Material of Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding Anonymous Author(s) Affiliation Address email
The feed-forward hidden dimension of Transformers is set to 1024. AC-to-all Transformer decoders, have 2 layers. The same setup is used for both the WOMD dataset and the A V2 dataset. In the following, we report the configuration of ablation models. VRAM (RTX 3090 in our case) because they require more GPU memory at training time.
- Europe > Switzerland > Zürich > Zürich (0.15)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
- Transportation > Ground > Road (0.95)
- Automobiles & Trucks (0.68)
Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding
The real-world deployment of an autonomous driving system requires its components to run on-board and in real-time, including the motion prediction module that predicts the future trajectories of surrounding traffic participants. Existing agent-centric methods have demonstrated outstanding performance on public benchmarks. However, they suffer from high computational overhead and poor scalability as the number of agents to be predicted increases. To address this problem, we introduce the K-nearest neighbor attention with relative pose encoding (KNARPE), a novel attention mechanism allowing the pairwise-relative representation to be used by Transformers. Then, based on KNARPE we present the Heterogeneous Polyline Transformer with Relative pose encoding (HPTR), a hierarchical framework enabling asynchronous token update during the online inference. By sharing contexts among agents and reusing the unchanged contexts, our approach is as efficient as scene-centric methods, while performing on par with state-of-the-art agent-centric methods. Experiments on Waymo and Argoverse-2 datasets show that HPTR achieves superior performance among end-to-end methods that do not apply expensive post-processing or model ensembling. The code is available at https://github.com/zhejz/HPTR.
- Europe > Switzerland > Zürich > Zürich (0.15)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
- Transportation > Ground > Road (0.95)
- Automobiles & Trucks (0.68)
TPK: Trustworthy Trajectory Prediction Integrating Prior Knowledge For Interpretability and Kinematic Feasibility
Baden, Marius, Abouelazm, Ahmed, Hubschneider, Christian, Wu, Yin, Slieter, Daniel, Zöllner, J. Marius
Trajectory prediction is crucial for autonomous driving, enabling vehicles to navigate safely by anticipating the movements of surrounding road users. However, current deep learning models often lack trustworthiness as their predictions can be physically infeasible and illogical to humans. To make predictions more trustworthy, recent research has incorporated prior knowledge, like the social force model for modeling interactions and kinematic models for physical realism. However, these approaches focus on priors that suit either vehicles or pedestrians and do not generalize to traffic with mixed agent classes. We propose incorporating interaction and kinematic priors of all agent classes--vehicles, pedestrians, and cyclists with class-specific interaction layers to capture agent behavioral differences. To improve the interpretability of the agent interactions, we introduce DG-SFM, a rule-based interaction importance score that guides the interaction layer. To ensure physically feasible predictions, we proposed suitable kinematic models for all agent classes with a novel pedestrian kinematic model. We benchmark our approach on the Argoverse 2 dataset, using the state-of-the-art transformer HPTR as our baseline. Experiments demonstrate that our method improves interaction interpretability, revealing a correlation between incorrect predictions and divergence from our interaction prior. Even though incorporating the kinematic models causes a slight decrease in accuracy, they eliminate infeasible trajectories found in the dataset and the baseline model. Thus, our approach fosters trust in trajectory prediction as its interaction reasoning is interpretable, and its predictions adhere to physics.
- Transportation > Ground > Road (0.48)
- Information Technology > Robotics & Automation (0.34)
Boundary-Guided Trajectory Prediction for Road Aware and Physically Feasible Autonomous Driving
Abouelazm, Ahmed, Liu, Mianzhi, Hubschneider, Christian, Wu, Yin, Slieter, Daniel, Zöllner, J. Marius
-- Accurate prediction of surrounding road users' trajectories is essential for safe and efficient autonomous driving. While deep learning models have improved performance, challenges remain in preventing off-road predictions and ensuring kinematic feasibility. Existing methods incorporate road-awareness modules and enforce kinematic constraints but lack plausibility guarantees and often introduce trade-offs in complexity and flexibility. This paper proposes a novel framework that formulates trajectory prediction as a constrained regression guided by permissible driving directions and their boundaries. Using the agent's current state and an HD map, our approach defines the valid boundaries and ensures on-road predictions by training the network to learn superimposed paths between left and right boundary polylines. T o guarantee feasibility, the model predicts acceleration profiles that determine the vehicle's travel distance along these paths while adhering to kinematic constraints. We evaluate our approach on the Argoverse-2 dataset against the HPTR baseline. Our approach shows a slight decrease in benchmark metrics compared to HPTR but notably improves final displacement error and eliminates infeasible trajectories. Moreover, the proposed approach has a superior generalization to less prevalent maneuvers and unseen out-of-distribution scenarios, reducing the off-road rate under adversarial attacks from 66% to just 1%.
- Transportation > Ground > Road (0.85)
- Information Technology > Robotics & Automation (0.85)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding
The real-world deployment of an autonomous driving system requires its components to run on-board and in real-time, including the motion prediction module that predicts the future trajectories of surrounding traffic participants. Existing agent-centric methods have demonstrated outstanding performance on public benchmarks. However, they suffer from high computational overhead and poor scalability as the number of agents to be predicted increases. To address this problem, we introduce the K-nearest neighbor attention with relative pose encoding (KNARPE), a novel attention mechanism allowing the pairwise-relative representation to be used by Transformers. Then, based on KNARPE we present the Heterogeneous Polyline Transformer with Relative pose encoding (HPTR), a hierarchical framework enabling asynchronous token update during the online inference.
TrafficBots V1.5: Traffic Simulation via Conditional VAEs and Transformers with Relative Pose Encoding
Zhang, Zhejun, Sakaridis, Christos, Van Gool, Luc
In this technical report we present TrafficBots V1.5, a baseline method for the closed-loop simulation of traffic agents. TrafficBots V1.5 achieves baseline-level performance and a 3rd place ranking in the Waymo Open Sim Agents Challenge (WOSAC) 2024. It is a simple baseline that combines TrafficBots, a CVAE-based multi-agent policy conditioned on each agent's individual destination and personality, and HPTR, the heterogeneous polyline transformer with relative pose encoding. To improve the performance on the WOSAC leaderboard, we apply scheduled teacher-forcing at the training time and we filter the sampled scenarios at the inference time. The code is available at https://github.com/zhejz/TrafficBotsV1.5.
Differential privacy and robust statistics in high dimensions
Liu, Xiyang, Kong, Weihao, Oh, Sewoong
We introduce a universal framework for characterizing the statistical efficiency of a statistical estimation problem with differential privacy guarantees. Our framework, which we call High-dimensional Propose-Test-Release (HPTR), builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism. Gluing all these together is the concept of resilience, which is central to robust statistical estimation. Resilience guides the design of the algorithm, the sensitivity analysis, and the success probability analysis of the test step in Propose-Test-Release. The key insight is that if we design an exponential mechanism that accesses the data only via one-dimensional robust statistics, then the resulting local sensitivity can be dramatically reduced. Using resilience, we can provide tight local sensitivity bounds. These tight bounds readily translate into near-optimal utility guarantees in several cases. We give a general recipe for applying HPTR to a given instance of a statistical estimation problem and demonstrate it on canonical problems of mean estimation, linear regression, covariance estimation, and principal component analysis. We introduce a general utility analysis technique that proves that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.
- North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)