BEV feature
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- Information Technology (0.88)
- Transportation > Ground > Road (0.66)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.68)
- Information Technology > Sensing and Signal Processing > Image Processing (0.68)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
- Asia > Singapore (0.04)
- Information Technology > Artificial Intelligence > Vision (0.72)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Education (0.49)
- Information Technology (0.46)
CluB: Cluster Meets BEV for LiDAR-Based 3D Object Detection
Currently, LiDAR-based 3D detectors are broadly categorized into two groups, namely, BEV-based detectors and cluster-based detectors. BEV-based detectors capture contextual information from the Bird's Eye View (BEV) and fill their center voxels via feature diffusion with a stack of convolution layers, which, however, weakens the ability to represent an object by its center point. On the other hand, cluster-based detectors exploit the voting mechanism and aggregate the foreground points into object-centric clusters for further prediction. In this paper, we explore how to effectively combine these two complementary representations into a unified framework. Specifically, we propose a new 3D object detection framework, referred to as CluB, which incorporates an auxiliary cluster-based branch into the BEV-based detector by enriching the object representation at both the feature and query levels. Technically, CluB comprises two steps. First, we construct a cluster feature diffusion module to establish the association between cluster features and BEV features in a subtle and adaptive fashion. Based on that, an imitation loss is introduced to distill object-centric knowledge from the cluster features to the BEV features. Second, we design a cluster query generation module to leverage the voting centers directly from the cluster branch, thus enriching the diversity of object queries. Meanwhile, a direction loss is employed to encourage a more accurate voting center for each cluster. Extensive experiments are conducted on the Waymo and nuScenes datasets, and our CluB achieves state-of-the-art performance on both benchmarks.
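The abstract names an imitation loss and a direction loss without giving their forms; the snippet below is a minimal PyTorch sketch assuming a foreground-masked MSE for the former and a cosine penalty for the latter. Function names and tensor shapes are illustrative assumptions, not CluB's actual implementation.

```python
# Hedged sketch of CluB's two auxiliary losses; forms and shapes are assumed.
import torch
import torch.nn.functional as F

def imitation_loss(bev_feat, cluster_feat, fg_mask):
    """Distill object-centric cluster knowledge into BEV features.

    bev_feat:     (B, C, H, W) BEV-branch features
    cluster_feat: (B, C, H, W) cluster features scattered onto the BEV grid
    fg_mask:      (B, 1, H, W) 1 at foreground (object) cells, 0 elsewhere
    """
    diff = (bev_feat - cluster_feat.detach()) ** 2   # stop-grad on the teacher side
    return (diff * fg_mask).sum() / fg_mask.sum().clamp(min=1.0)

def direction_loss(pred_offsets, gt_offsets):
    """Encourage each point's vote to point toward its object center.

    pred_offsets, gt_offsets: (N, 3) per-point offsets to the voted center.
    Penalizes direction only, ignoring offset magnitude.
    """
    cos = F.cosine_similarity(pred_offsets, gt_offsets, dim=-1)
    return (1.0 - cos).mean()
```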
BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection
Zhang, Guowen, He, Chenhang, Chen, Liyi, Zhang, Lei
Integrating LiDAR and camera information in the bird's eye view (BEV) representation has demonstrated its effectiveness in 3D object detection. However, because of the fundamental disparity in geometric accuracy between these sensors, indiscriminate fusion in previous methods often leads to degraded performance. In this paper, we propose BEVDilation, a novel LiDAR-centric framework that prioritizes LiDAR information in the fusion. By formulating image BEV features as implicit guidance rather than naive concatenation, our strategy effectively alleviates the spatial misalignment caused by image depth estimation errors. Furthermore, the image guidance can effectively help the LiDAR-centric paradigm to address the sparsity and semantic limitations of point clouds. Specifically, we propose a Sparse Voxel Dilation Block that mitigates the inherent point sparsity by densifying foreground voxels through image priors. Moreover, we introduce a Semantic-Guided BEV Dilation Block to enhance the LiDAR feature diffusion process with image semantic guidance and long-range context capture. On the challenging nuScenes benchmark, BEVDilation achieves better performance than state-of-the-art methods while maintaining competitive computational efficiency. Importantly, our LiDAR-centric strategy demonstrates greater robustness to depth noise compared to naive fusion.
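To make the "implicit guidance rather than naive concatenation" idea concrete, here is a hedged sketch in which camera BEV features only gate and bias the LiDAR BEV features, so the LiDAR signal remains primary. This gating form is an assumption for illustration; the paper's dilation blocks are more elaborate.

```python
# Sketch of LiDAR-centric guided fusion; the gating design is assumed, not BEVDilation's.
import torch
import torch.nn as nn

class GuidedBEVFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.bias = nn.Conv2d(channels, channels, 1)

    def forward(self, lidar_bev: torch.Tensor, image_bev: torch.Tensor) -> torch.Tensor:
        # lidar_bev, image_bev: (B, C, H, W). The image features only modulate
        # the LiDAR map, so depth errors in image_bev cannot overwrite the
        # geometrically accurate LiDAR features, unlike channel concatenation.
        return lidar_bev * self.gate(image_bev) + self.bias(image_bev)
```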
MAP-World: Masked Action Planning and Path-Integral World Model for Autonomous Driving
Hu, Bin, Lu, Zijian, Liao, Haicheng, Yuan, Chengran, Rao, Bin, Li, Yongkang, Li, Guofa, Cui, Zhiyong, Xu, Cheng-zhong, Li, Zhenning
Motion planning for autonomous driving must handle multiple plausible futures while remaining computationally efficient. Recent end-to-end systems and world-model-based planners predict rich multi-modal trajectories, but typically rely on handcrafted anchors or reinforcement learning to select a single best mode for training and control. This selection discards information about alternative futures and complicates optimization. We propose MAP-World, a prior-free multi-modal planning framework that couples masked action planning with a path-weighted world model. The Masked Action Planning (MAP) module treats future ego motion as masked sequence completion: past waypoints are encoded as visible tokens, future waypoints are represented as mask tokens, and a driving-intent path provides a coarse scaffold. A compact latent planning state is expanded into multiple trajectory queries with injected noise, yielding diverse, temporally consistent modes without anchor libraries or teacher policies. A lightweight world model then rolls out future BEV semantics conditioned on each candidate trajectory. During training, semantic losses are computed as an expectation over modes, using trajectory probabilities as discrete path weights, so the planner learns from the full distribution of plausible futures instead of a single selected path. On NAVSIM, our method matches anchor-based approaches and achieves state-of-the-art performance among world-model-based methods, while avoiding reinforcement learning and maintaining real-time inference latency.
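The key training idea, computing semantic losses as an expectation over modes with trajectory probabilities as discrete path weights, can be sketched as follows. Tensor shapes, names, and the cross-entropy form are assumptions for illustration, not the paper's code.

```python
# Hedged sketch of a path-weighted semantic objective over K trajectory modes.
import torch
import torch.nn.functional as F

def path_weighted_semantic_loss(mode_logits, rollout_logits, gt_semantics):
    """
    mode_logits:    (B, K)             unnormalized scores per trajectory mode
    rollout_logits: (B, K, T, C, H, W) world-model BEV semantic predictions,
                                       one rollout per candidate trajectory
    gt_semantics:   (B, T, H, W) long  ground-truth BEV semantic labels
    """
    B, K = mode_logits.shape
    weights = mode_logits.softmax(dim=-1)                  # (B, K) discrete path weights
    losses = []
    for k in range(K):
        ce = F.cross_entropy(
            rollout_logits[:, k].flatten(0, 1),            # (B*T, C, H, W)
            gt_semantics.flatten(0, 1),                    # (B*T, H, W)
            reduction="none",
        )                                                  # (B*T, H, W)
        losses.append(ce.view(B, -1).mean(dim=1))          # (B,) per-sample loss for mode k
    per_mode = torch.stack(losses, dim=1)                  # (B, K)
    # Expectation over modes: no argmax selection, so every plausible future
    # contributes to the gradient in proportion to its predicted probability.
    return (weights * per_mode).sum(dim=1).mean()
```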
- Asia > Macao (0.05)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > Singapore (0.04)
- (3 more...)
- Transportation > Ground > Road (0.75)
- Information Technology > Robotics & Automation (0.65)
- Automobiles & Trucks (0.65)
Enhancing End-to-End Autonomous Driving with Risk Semantic Distillation from VLM
Qin, Jack, Wang, Zhitao, Zheng, Yinan, Chen, Keyu, Zhou, Yang, Zhong, Yuanxin, Cheng, Siyuan
The autonomous driving (AD) system has exhibited remarkable performance in complex driving scenarios. However, generalization is still a key limitation of current systems, i.e., the ability to handle unseen scenarios or unfamiliar sensor configurations. Related works have explored the use of Vision-Language Models (VLMs) to address few-shot or zero-shot tasks. While promising, these methods introduce a new challenge: the emergence of a hybrid AD system, where two distinct systems are used to plan a trajectory, leading to potential inconsistencies. Alternative research directions have explored Vision-Language-Action (VLA) frameworks that generate control actions from a VLM directly. However, these end-to-end solutions demonstrate prohibitive computational demands. To overcome these challenges, we introduce Risk Semantic Distillation (RSD), a novel framework that leverages VLMs to enhance the training of End-to-End (E2E) AD backbones. By providing risk attention for key objects, RSD addresses the issue of generalization. Specifically, we introduce RiskHead, a plug-in module that distills causal risk estimates from Vision-Language Models into Bird's-Eye-View (BEV) features, yielding interpretable risk-attention maps. This approach allows BEV features to learn richer and more nuanced risk-attention representations, which directly enhance the model's ability to handle spatial boundaries and risky objects. By focusing on risk attention, RSD aligns better with human-like driving behavior, which is essential for navigating complex and dynamic environments. Our experiments on the Bench2Drive benchmark demonstrate the effectiveness of RSD in managing complex and unpredictable driving conditions. Due to the enhanced BEV representations enabled by RSD, we observe a significant improvement in both perception and planning capabilities.
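A RiskHead-style plug-in can be sketched as a small convolutional head over BEV features, distilled against a risk map precomputed by the VLM teacher. The head architecture and the BCE distillation objective below are assumptions; the abstract only describes the module at a high level.

```python
# Hedged sketch of a RiskHead-style distillation module; design details assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RiskHead(nn.Module):
    def __init__(self, in_channels: int):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, in_channels // 2, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels // 2, 1, 1),   # per-cell risk logit
        )

    def forward(self, bev_feat: torch.Tensor) -> torch.Tensor:
        return self.head(bev_feat)               # (B, 1, H, W) risk-attention logits

def risk_distillation_loss(risk_logits, vlm_risk_map):
    # vlm_risk_map: (B, 1, H, W) soft risk targets in [0, 1], produced offline
    # by the VLM teacher; gradients flow only into the E2E backbone and head.
    return F.binary_cross_entropy_with_logits(risk_logits, vlm_risk_map)
```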
- Transportation > Ground > Road (0.89)
- Information Technology > Robotics & Automation (0.66)
- Automobiles & Trucks (0.66)