AITopics | argoverse 2

ΔFlow: An Efficient Multi-frame Scene Flow Estimation Method

Neural Information Processing Systems | Jun-19-2026, 01:48:14 GMT

While recent trends shift towards multi-frame reasoning, multi-frame methods suffer from rapidly escalating computational costs as the number of frames grows. To leverage temporal information more efficiently, we propose DeltaFlow (ΔFlow), a lightweight 3D framework that captures motion cues via a Δ scheme, extracting temporal features with minimal computational cost, regardless of the number of frames. Additionally, scene flow estimation faces challenges such as imbalanced object class distributions and motion inconsistency. To tackle these issues, we introduce a Category-Balanced Loss to enhance learning across underrepresented classes and an Instance Consistency Loss to enforce coherent object motion, improving flow accuracy. Extensive evaluations on the Argoverse 2, Waymo and nuScenes datasets show that ΔFlow achieves state-of-the-art performance with up to 22% lower error and 2× faster inference compared to the next-best multi-frame supervised method, while also demonstrating strong cross-domain generalization ability.

artificial intelligence, machine learning, object-oriented architecture, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.68)
Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.48)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.47)

SDTagNet: Leveraging Text-Annotated Navigation Maps for Online HD Map Construction

Neural Information Processing Systems | Jun-15-2026, 21:42:09 GMT

Autonomous vehicles rely on detailed and accurate environmental information to operate safely. High definition (HD) maps offer a promising solution, but their high maintenance cost poses a significant barrier to scalable deployment. This challenge is addressed by online HD map construction methods, which generate local HD maps from live sensor data. However, these methods are inherently limited by the short perception range of onboard sensors. To overcome this limitation and improve general performance, recent approaches have explored the use of standard definition (SD) maps as prior, which are significantly easier to maintain.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Transportation > Ground > Road (0.93)
Transportation > Infrastructure & Services (0.93)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.48)

2ab47c960bfee4f86dfc362f26ad066a-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 06:01:06 GMT

artificial intelligence, machine learning, trajectory, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.71)

Fully Sparse 3D Object Detection

Neural Information Processing Systems | Apr-24-2026, 07:54:42 GMT

As the perception range of LiDAR increases, LiDAR-based 3D object detection becomes a dominant task in long-range perception for autonomous driving. Mainstream 3D object detectors usually build dense feature maps in the network backbone and prediction head. However, the computational and spatial costs of a dense feature map are quadratic in the perception range, which makes it hard for these detectors to scale to the long-range setting. To enable efficient long-range LiDAR-based object detection, we build a fully sparse 3D object detector (FSD). The computational and spatial cost of FSD is roughly linear in the number of points and independent of the perception range. FSD is built upon a general sparse voxel encoder and a novel sparse instance recognition (SIR) module.
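The scaling claim in this abstract can be illustrated with a small back-of-the-envelope sketch. It is not the FSD implementation: the square bird's-eye-view grid, the 0.5 m cell size, and both function names are assumptions chosen only to show why a dense map grows quadratically with range while a sparse one is bounded by the point count.

```python
import math


def dense_bev_cells(perception_range_m: float, cell_size_m: float = 0.5) -> int:
    """Cell count of a square dense BEV grid spanning [-range, +range] on both axes.

    The 0.5 m default cell size is an illustrative assumption, not a value
    taken from the paper.
    """
    cells_per_side = math.ceil(2 * perception_range_m / cell_size_m)
    return cells_per_side * cells_per_side


def sparse_voxels_upper_bound(num_points: int) -> int:
    """A sparse representation stores at most one occupied voxel per point,
    so its size is bounded by the point count, whatever the range."""
    return num_points


if __name__ == "__main__":
    for range_m in (50, 100, 200):
        print(f"range {range_m} m: {dense_bev_cells(range_m)} dense cells")
```

Under these assumptions, doubling the perception range quadruples the number of dense cells, whereas the sparse bound changes only if the sensor returns more points.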

artificial intelligence, detection, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.67)

Industry:

Information Technology (0.48)
Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.87)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.34)

DeMo: Decoupling Motion Forecasting into Directional Intentions and Dynamic States

Neural Information Processing Systems | Feb-17-2026, 22:40:21 GMT

Li Zhang (lizhangfd@fudan.edu.cn) is the corresponding author. Previous methods, as depicted in (a), use only one mode query for each trajectory.

machine learning, natural language, trajectory, (20 more...)

Neural Information Processing Systems

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

8ec078518dcce6be1324cfd3de11ed24-Paper-Conference.pdf

Neural Information Processing Systems | Feb-16-2026, 14:15:49 GMT

machine learning, natural language, trajectory, (20 more...)

Neural Information Processing Systems

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Vision (0.68)

2ab47c960bfee4f86dfc362f26ad066a-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 01:46:11 GMT

agent, static intention point, trajectory, (12 more...)

Neural Information Processing Systems

Country: Europe > Germany > Saarland (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.54)

SDTagNet: Leveraging Text-Annotated Navigation Maps for Online HD Map Construction

Immel, Fabian, Pauls, Jan-Hendrik, Fehler, Richard, Bieder, Frank, Merkert, Jonas, Stiller, Christoph

arXiv.org Artificial Intelligence | Oct-22-2025

Autonomous vehicles rely on detailed and accurate environmental information to operate safely. High definition (HD) maps offer a promising solution, but their high maintenance cost poses a significant barrier to scalable deployment. This challenge is addressed by online HD map construction methods, which generate local HD maps from live sensor data. However, these methods are inherently limited by the short perception range of onboard sensors. To overcome this limitation and improve general performance, recent approaches have explored the use of standard definition (SD) maps as prior, which are significantly easier to maintain. We propose SDTagNet, the first online HD map construction method that fully utilizes the information of widely available SD maps, like OpenStreetMap, to enhance far range detection accuracy. Our approach introduces two key innovations. First, in contrast to previous work, we incorporate not only polyline SD map data with manually selected classes, but additional semantic information in the form of textual annotations. In this way, we enrich SD vector map tokens with NLP-derived features, eliminating the dependency on predefined specifications or exhaustive class taxonomies. Second, we introduce a point-level SD map encoder together with orthogonal element identifiers to uniformly integrate all types of map elements. Experiments on Argoverse 2 and nuScenes show that this boosts map perception performance by up to +5.9 mAP (+45%) w.r.t. map construction without priors and up to +3.2 mAP (+20%) w.r.t. previous approaches that already use SD map priors. Code is available at https://github.com/immel-f/SDTagNet

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.08997

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.93)
Transportation > Infrastructure & Services (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.48)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.34)

CAMNet: Leveraging Cooperative Awareness Messages for Vehicle Trajectory Prediction

Grasselli, Mattia, Porrello, Angelo, Grazia, Carlo Augusto

arXiv.org Artificial Intelligence | Oct-15-2025

Autonomous driving remains a challenging task, particularly due to safety concerns. Modern vehicles are typically equipped with expensive sensors such as LiDAR, cameras, and radars to reduce the risk of accidents. However, these sensors face inherent limitations: their field of view and line of sight can be obstructed by other vehicles, thereby reducing situational awareness. In this context, vehicle-to-vehicle communication plays a crucial role, as it enables cars to share information and remain aware of each other even when sensors are occluded. One way to achieve this is through the use of Cooperative Awareness Messages (CAMs). In this paper, we investigate the use of CAM data for vehicle trajectory prediction. Specifically, we design and train a neural network, Cooperative Awareness Message-based Graph Neural Network (CAMNet), on a widely used motion forecasting dataset. We then evaluate the model on a second dataset that we created from scratch using Cooperative Awareness Messages, in order to assess whether this type of data can be effectively exploited. Our approach demonstrates promising results, showing that CAMs can indeed support vehicle trajectory prediction. At the same time, we discuss several limitations of the approach, which highlight opportunities for future research.

artificial intelligence, machine learning, vehicle, (17 more...)

arXiv.org Artificial Intelligence

2510.12703

Country: