Cooperative Perception
Flow-based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection - Appendix
Yu, Haibao, Tang, Yingjuan
Mean Average Precision (mAP). For VIC3D object detection, we focus on the obstacles around the ego vehicle. Two metrics are used for evaluation: BEV@mAP and 3D@mAP. BEV@mAP evaluates the 3D boxes in the bird's-eye view and ignores the height dimension, while 3D@mAP evaluates the full 3D boxes. In our implementation, we ignore the transmission cost of calibration files and timestamps. For early fusion, we calculate the transmission cost of transmitting the raw data.
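To make the BEV@mAP vs. 3D@mAP distinction in the appendix excerpt above concrete, here is a minimal sketch (not the paper's evaluation code) using axis-aligned boxes; real VIC3D evaluation handles yawed boxes, but the point stands: BEV IoU drops the height extent that 3D IoU keeps.

```python
# Illustrative only: axis-aligned boxes (x, y, z, dx, dy, dz).
# BEV IoU ignores the z extent; 3D IoU keeps it.

def interval_overlap(c1, d1, c2, d2):
    """Overlap length of two 1-D intervals given centers and sizes."""
    lo = max(c1 - d1 / 2, c2 - d2 / 2)
    hi = min(c1 + d1 / 2, c2 + d2 / 2)
    return max(0.0, hi - lo)

def bev_iou(a, b):
    """IoU of the boxes' bird's-eye-view footprints (height ignored)."""
    inter = (interval_overlap(a[0], a[3], b[0], b[3])
             * interval_overlap(a[1], a[4], b[1], b[4]))
    union = a[3] * a[4] + b[3] * b[4] - inter
    return inter / union if union > 0 else 0.0

def iou_3d(a, b):
    """Full 3D IoU: the BEV overlap scaled by the z-axis overlap."""
    inter = (interval_overlap(a[0], a[3], b[0], b[3])
             * interval_overlap(a[1], a[4], b[1], b[4])
             * interval_overlap(a[2], a[5], b[2], b[5]))
    union = a[3] * a[4] * a[5] + b[3] * b[4] * b[5] - inter
    return inter / union if union > 0 else 0.0

# Identical footprints but offset heights: BEV IoU is 1.0,
# while 3D IoU drops because the vertical overlap is partial.
gt   = (0.0, 0.0, 0.0, 4.0, 2.0, 1.5)
pred = (0.0, 0.0, 0.5, 4.0, 2.0, 1.5)
print(bev_iou(gt, pred))  # 1.0
print(iou_3d(gt, pred))   # 0.5
```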
DRCP: Diffusion on Reinforced Cooperative Perception for Perceiving Beyond Limits
Li, Lantao, Yang, Kang, Song, Rui, Sun, Chen
Abstract-- Cooperative perception enabled by Vehicle-to-Everything communication has shown great promise in enhancing situational awareness for autonomous vehicles and other mobile robotic platforms. Despite recent advances in perception backbones and multi-agent fusion, real-world deployments remain challenged by hard detection cases, exemplified by partial detections and noise accumulation, which limit downstream detection accuracy. This work presents Diffusion on Reinforced Cooperative Perception (DRCP), a real-time deployable framework designed to address the aforementioned issues in dynamic driving environments. The proposed system achieves real-time performance on mobile platforms while significantly improving robustness under challenging conditions. Code will be released in late 2025. Robotic systems such as autonomous vehicles and mobile agents rely heavily on perception to understand their surroundings and make informed decisions.
- Information Technology (0.68)
- Transportation (0.46)
V2V-GoT: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multimodal Large Language Models and Graph-of-Thoughts
Chiu, Hsu-kuang, Hachiuma, Ryo, Wang, Chien-Yi, Wang, Yu-Chiang Frank, Chen, Min-Hung, Smith, Stephen F.
Abstract-- Current state-of-the-art autonomous vehicles could face safety-critical situations when their local sensors are occluded by large nearby objects on the road. Vehicle-to-vehicle (V2V) cooperative autonomous driving has been proposed as a means of addressing this problem, and one recently introduced framework for cooperative autonomous driving has further adopted an approach that incorporates a Multimodal Large Language Model (MLLM) to integrate cooperative perception and planning processes. However, despite the potential benefit of applying graph-of-thoughts reasoning to the MLLM, this idea has not been considered by previous cooperative autonomous driving research. In this paper, we propose a novel graph-of-thoughts framework specifically designed for MLLM-based cooperative autonomous driving. Our graph-of-thoughts includes our proposed novel ideas of occlusion-aware perception and planning-aware prediction. We curate the V2V-GoT-QA dataset and develop the V2V-GoT model for training and testing the cooperative driving graph-of-thoughts. Our experimental results show that our method outperforms other baselines in cooperative perception, prediction, and planning tasks. Today's autonomous vehicles rely mainly on mounted cameras or LiDAR sensors to perceive the world, understand the dynamic surrounding scenes, and make driving decisions over time. Inherently, such reliance on the vehicle's local sensors can be limiting, particularly in situations where vehicles and other potential obstacles are occluded by other large nearby objects, such as buses or trucks.
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- Automobiles & Trucks (1.00)
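To illustrate the pipeline structure V2V-GoT describes, here is a hedged Python sketch of a graph-of-thoughts executed in dependency order; the node names, dict-based state, and topological execution are illustrative assumptions, not the V2V-GoT implementation.

```python
# Perception, prediction, and planning as "thought" nodes whose outputs
# feed their successors, in the spirit of the abstract. Hypothetical names.

from graphlib import TopologicalSorter

def occlusion_aware_perception(state):
    # Peers contribute detections the ego's occluded sensors would miss.
    state["objects"] = ["car_1", "truck_2 (reported by V2V peer)"]

def planning_aware_prediction(state):
    # Prediction conditioned on candidate ego plans, per the paper's idea.
    state["trajectories"] = {o: f"forecast({o} | ego plan)"
                             for o in state["objects"]}

def planning(state):
    state["plan"] = "yield to truck_2, then proceed"

# Each node maps to the set of nodes it depends on.
graph = {
    occlusion_aware_perception: set(),
    planning_aware_prediction: {occlusion_aware_perception},
    planning: {planning_aware_prediction},
}

state = {}
for node in TopologicalSorter(graph).static_order():
    node(state)  # each thought consumes its predecessors' outputs
print(state["plan"])
```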
Research Challenges and Progress in the End-to-End V2X Cooperative Autonomous Driving Competition
Hao, Ruiyang, Yu, Haibao, Zhong, Jiaru, Wang, Chuanye, Wang, Jiahao, Kan, Yiming, Yang, Wenxian, Fan, Siqi, Yin, Huilin, Qiu, Jianing, Mu, Yao, Sun, Jiankai, Chen, Li, Zimmer, Walter, Zhang, Dandan, Zhang, Shanghang, Schwager, Mac, Luo, Ping, Nie, Zaiqing
With the rapid advancement of autonomous driving technology, vehicle-to-everything (V2X) communication has emerged as a key enabler for extending perception range and enhancing driving safety by providing visibility beyond the line of sight. However, integrating multi-source sensor data from both ego-vehicles and infrastructure under real-world constraints, such as limited communication bandwidth and dynamic environments, presents significant technical challenges. To facilitate research in this area, we organized the End-to-End Autonomous Driving through V2X Cooperation Challenge, which features two tracks: cooperative temporal perception and cooperative end-to-end planning. Built on the UniV2X framework and the V2X-Seq-SPD dataset, the challenge attracted participation from over 30 teams worldwide and established a unified benchmark for evaluating cooperative driving systems. This paper describes the design and outcomes of the challenge, highlights key research problems including bandwidth-aware fusion, robust multi-agent planning, and heterogeneous sensor integration, and analyzes emerging technical trends among top-performing solutions. By addressing practical constraints in communication and data fusion, the challenge contributes to the development of scalable and reliable V2X-cooperative autonomous driving systems.
- Asia > China > Hong Kong (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Texas (0.04)
- (5 more...)
- Overview (0.93)
- Research Report (0.64)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- Automobiles & Trucks (1.00)
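Bandwidth-aware fusion is one of the research problems the challenge highlights; the sketch below illustrates the general idea of shrinking an infrastructure BEV feature map before transmission. The channel-energy selector and 8-bit quantization are assumptions for illustration, not the UniV2X implementation.

```python
# Hedged sketch: channel reduction plus per-channel 8-bit quantization
# of a BEV feature map before sending it over a V2X link.

import numpy as np

def compress_bev_features(feat: np.ndarray, keep_channels: int):
    """feat: (C, H, W) float32 BEV features from the infrastructure side."""
    # Keep the highest-energy channels (a stand-in for a learned selector).
    energy = np.abs(feat).sum(axis=(1, 2))
    idx = np.argsort(energy)[-keep_channels:]
    kept = feat[idx]
    # Per-channel 8-bit quantization over each channel's value range.
    lo = kept.min(axis=(1, 2), keepdims=True)
    hi = kept.max(axis=(1, 2), keepdims=True)
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    q = np.round((kept - lo) / scale).astype(np.uint8)
    return q, idx, lo, scale  # receiver dequantizes with idx, lo, scale

feat = np.random.randn(256, 128, 128).astype(np.float32)
q, idx, lo, scale = compress_bev_features(feat, keep_channels=64)
print(f"raw: {feat.nbytes / 1e6:.1f} MB -> sent: {q.nbytes / 1e6:.1f} MB")
# raw: 16.8 MB -> sent: 1.0 MB, a ~16x reduction before entropy coding.
```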
TruckV2X: A Truck-Centered Perception Dataset
Xie, Tenghui, Song, Zhiying, Wen, Fuxi, Li, Jun, Liu, Guangzhao, Zhao, Zijian
Abstract-- Autonomous trucking offers significant benefits, such as improved safety and reduced costs, but faces unique perception challenges due to trucks' large size and dynamic trailer movements. These challenges include extensive blind spots and occlusions that hinder the truck's perception and the capabilities of other road users. To address these limitations, cooperative perception emerges as a promising solution. However, existing datasets predominantly feature light-vehicle interactions or lack multi-agent configurations for heavy-duty vehicle scenarios. To bridge this gap, we introduce TruckV2X, the first large-scale truck-centered cooperative perception dataset featuring multi-modal sensing (LiDAR and cameras) and multi-agent cooperation (tractors, trailers, CAVs, and RSUs). We further investigate how trucks influence collaborative perception needs, establishing performance benchmarks while suggesting research priorities for heavy vehicle perception. The dataset provides a foundation for developing cooperative perception systems with enhanced occlusion handling capabilities, and accelerates the deployment of multi-agent autonomous trucking systems. Autonomous trucking is expected to benefit the logistics industry through improved road safety, reduced operational costs, and solutions to driver shortages [1].
- North America > United States > District of Columbia > Washington (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Transportation > Ground > Road (1.00)
- Transportation > Freight & Logistics Services (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
Cooperative Perception: A Resource-Efficient Framework for Multi-Drone 3D Scene Reconstruction Using Federated Diffusion and NeRF
The proposal introduces a drone-swarm perception system that addresses computational limitations, low-bandwidth communication, and real-time scene reconstruction. The framework enables efficient multi-agent 3D/4D scene synthesis through federated learning of a shared diffusion model, lightweight YOLOv12 semantic extraction, and local NeRF updates, while maintaining privacy and scalability. It redesigns generative diffusion models for joint scene reconstruction, improves cooperative scene understanding, and adds semantic-aware compression protocols. The approach can be validated through simulation and potential real-world deployment on drone testbeds, positioning it as a disruptive advancement in multi-agent AI for autonomous systems.
- Research Report (0.64)
- Overview (0.46)
- Information Technology (0.94)
- Transportation > Ground > Road (0.47)
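The proposal above names federated learning of a shared diffusion model without specifying an algorithm; as one plausible instantiation, a FedAvg-style weighted aggregation of per-drone weight updates could look like the following sketch (the parameter names are hypothetical).

```python
# Minimal FedAvg sketch: drones share model weights, never raw imagery.
# FedAvg is an assumption for illustration, not the proposal's method.

from typing import Dict, List
import numpy as np

StateDict = Dict[str, np.ndarray]

def fed_avg(updates: List[StateDict], weights: List[float]) -> StateDict:
    """Weighted average of per-drone model weights (e.g., by sample count)."""
    total = sum(weights)
    merged: StateDict = {}
    for name in updates[0]:
        merged[name] = sum(w * u[name]
                           for w, u in zip(weights, updates)) / total
    return merged

# Three drones contribute local diffusion-model updates, weighted by how
# much data each collected this round.
drone_updates = [{"unet.conv1": np.random.randn(8, 8)} for _ in range(3)]
global_weights = fed_avg(drone_updates, weights=[120.0, 80.0, 200.0])
print(global_weights["unet.conv1"].shape)  # (8, 8)
```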
HeCoFuse: Cross-Modal Complementary V2X Cooperative Perception with Heterogeneous Sensors
Wei, Chuheng, Qin, Ziye, Zimmer, Walter, Wu, Guoyuan, Barth, Matthew J.
Real-world Vehicle-to-Everything (V2X) cooperative perception systems often operate under heterogeneous sensor configurations due to cost constraints and deployment variability across vehicles and infrastructure. This heterogeneity poses significant challenges for feature fusion and perception reliability. To address these issues, we propose HeCoFuse, a unified framework designed for cooperative perception across mixed sensor setups where nodes may carry Cameras (C), LiDARs (L), or both. By introducing a hierarchical fusion mechanism that adaptively weights features through a combination of channel-wise and spatial attention, HeCoFuse can tackle critical challenges such as cross-modality feature misalignment and imbalanced representation quality. In addition, an adaptive spatial resolution adjustment module is employed to balance computational cost and fusion effectiveness. To enhance robustness across different configurations, we further implement a cooperative learning strategy that dynamically adjusts fusion type based on available modalities. Experiments on the real-world TUMTraf-V2X dataset demonstrate that HeCoFuse achieves 43.22% 3D mAP under the full sensor configuration (LC+LC), outperforming the CoopDet3D baseline by 1.17%, and reaches an even higher 43.38% 3D mAP in the L+LC scenario, while maintaining 3D mAP in the range of 21.74% to 43.38% across nine heterogeneous sensor configurations. These results, validated by our first-place finish in the CVPR 2025 DriveX challenge, establish HeCoFuse as the current state-of-the-art on the TUMTraf-V2X dataset while demonstrating robust performance across diverse sensor deployments.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- North America > United States > California > Riverside County > Riverside (0.04)
- Asia > China > Sichuan Province > Chengdu (0.04)
- Transportation (0.69)
- Automobiles & Trucks (0.46)
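As a rough illustration of fusing two agents' features via combined channel-wise and spatial attention, here is a hedged PyTorch sketch; the layer sizes and exact gating form are assumptions in the spirit of the HeCoFuse abstract, not the released model.

```python
# Hypothetical two-agent fusion: squeeze-excite channel gating followed
# by a per-pixel softmax that blends ego and cooperative BEV features.

import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite per channel.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 8, 1), nn.ReLU(),
            nn.Conv2d(channels // 8, channels, 1), nn.Sigmoid(),
        )
        # Spatial attention: one weight map per agent from stacked features.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1),
            nn.Softmax(dim=1),  # per-pixel weights sum to 1 over agents
        )

    def forward(self, ego: torch.Tensor, coop: torch.Tensor) -> torch.Tensor:
        # Reweight each agent's channels, then blend spatially.
        ego = ego * self.channel_gate(ego)
        coop = coop * self.channel_gate(coop)
        w = self.spatial_gate(torch.cat([ego, coop], dim=1))  # (B, 2, H, W)
        return w[:, 0:1] * ego + w[:, 1:2] * coop

fusion = AttentiveFusion(channels=64)
fused = fusion(torch.randn(1, 64, 100, 100), torch.randn(1, 64, 100, 100))
print(fused.shape)  # torch.Size([1, 64, 100, 100])
```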
CoInfra: A Large-Scale Cooperative Infrastructure Perception System and Dataset in Adverse Weather
Ning, Minghao, Yang, Yufeng, Shu, Keqi, Huang, Shucheng, Zhong, Jiaming, Salehi, Maryam, Rahmani, Mahdi, Lu, Yukun, Sun, Chen, Saleh, Aladdin, Hashemi, Ehsan, Khajepour, Amir
We present CoInfra, a large-scale cooperative infrastructure perception system and dataset designed to advance robust multi-agent perception under real-world and adverse weather conditions. The CoInfra system includes 14 fully synchronized sensor nodes, each equipped with dual RGB cameras and a LiDAR, deployed across a shared region and operating continuously to capture all traffic participants in real-time. A robust, delay-aware synchronization protocol and a scalable system architecture that supports real-time data fusion, OTA management, and remote monitoring are provided in this paper. The dataset was collected in diverse weather scenarios, including sunny, rainy, freezing-rain, and heavy-snow conditions, and comprises 195k LiDAR frames and 390k camera images from 8 infrastructure nodes that are globally time-aligned and spatially calibrated. Furthermore, comprehensive 3D bounding-box annotations for five object classes (i.e., car, bus, truck, person, and bicycle) are provided in both global and individual node frames, along with high-definition maps for contextual understanding. Baseline experiments demonstrate the trade-offs between early and late fusion strategies, and the significant benefits of HD map integration are discussed. By openly releasing our dataset, codebase, and system documentation at https://github.com/NingMingHao/CoInfra, we aim to enable reproducible research and drive progress in infrastructure-supported autonomous driving, particularly in challenging, real-world settings.
- North America > Canada > Ontario > Toronto (0.14)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.14)
- Asia > China > Hong Kong (0.04)
- (8 more...)
- Information Technology (1.00)
- Transportation > Ground > Road (0.89)
- Information Technology > Data Science > Data Integration (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- (3 more...)
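As an illustration of the delay-aware synchronization CoInfra describes, the sketch below (not the CoInfra codebase) matches each reference frame to a node's nearest-in-time frame and drops pairs whose residual delay exceeds a tolerance.

```python
# Hypothetical nearest-timestamp matching across two sensor nodes.

import bisect

def match_frames(ref_ts, node_ts, tol_s=0.05):
    """ref_ts, node_ts: sorted lists of timestamps (seconds).
    Returns (ref_t, node_t) pairs whose gap is within tol_s."""
    pairs = []
    for t in ref_ts:
        i = bisect.bisect_left(node_ts, t)
        # The nearest frame is either just before or just after t.
        candidates = node_ts[max(0, i - 1):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda s: abs(s - t))
        if abs(best - t) <= tol_s:
            pairs.append((t, best))
    return pairs

ref = [0.00, 0.10, 0.20, 0.30]
node = [0.02, 0.11, 0.27, 0.31]   # node clock jitters slightly
print(match_frames(ref, node))
# [(0.0, 0.02), (0.1, 0.11), (0.3, 0.31)] -- 0.20 has no match within 50 ms
```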
AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration
Gao, Xiangbo, Wu, Yuheng, Yang, Fengze, Luo, Xuewen, Wu, Keshu, Chen, Xinghao, Wang, Yuping, Liu, Chenxi, Zhou, Yang, Tu, Zhengzhong
While multi-vehicular collaborative driving demonstrates clear advantages over single-vehicle autonomy, traditional infrastructure-based V2X systems remain constrained by substantial deployment costs and the creation of "uncovered danger zones" in rural and suburban areas. We present AirV2X-Perception, a large-scale dataset that leverages Unmanned Aerial Vehicles (UAVs) as a flexible alternative or complement to fixed Road-Side Units (RSUs). Drones offer unique advantages over ground-based perception: complementary bird's-eye views that reduce occlusions, dynamic positioning capabilities that enable hovering, patrolling, and escorting maneuvers, and significantly lower deployment costs compared to fixed infrastructure. Our dataset comprises 6.73 hours of drone-assisted driving scenarios across urban, suburban, and rural environments with varied weather and lighting conditions. The AirV2X-Perception dataset facilitates the development and standardized evaluation of Vehicle-to-Drone (V2D) algorithms, addressing a critical gap in the rapidly expanding field of aerial-assisted autonomous driving systems. The dataset and development kits are open-sourced at https://github.com/taco-group/AirV2X-Perception.
- North America > United States > Utah (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- North America > United States > Michigan (0.04)
- (2 more...)
- Energy (1.00)
- Transportation > Ground > Road (0.89)
- Automobiles & Trucks (0.89)
- Information Technology > Robotics & Automation (0.69)