Hu, Senkang
Sense4FL: Vehicular Crowdsensing Enhanced Federated Learning for Autonomous Driving
Ma, Yanan, Hu, Senkang, Fang, Zhengru, Ji, Yun, Deng, Yiqin, Fang, Yuguang
To accommodate constantly changing road conditions, real-time model training is essential for autonomous driving (AD). Federated learning (FL) serves as a promising paradigm to enable autonomous vehicles to train models collaboratively with their onboard computing resources. However, existing vehicle selection schemes for FL assume that vehicles' datasets are predetermined and location-independent, neglecting the fact that vehicles collect training data along their routes, which leads to suboptimal vehicle selection. To improve the perception quality of AD in a region, we propose Sense4FL, a vehicular crowdsensing-enhanced FL framework featuring trajectory-dependent vehicular training data collection. To this end, we first derive the convergence bound of FL by considering the impact of both vehicles' uncertain trajectories and their uploading probabilities, from which we find that minimizing the training loss is equivalent to minimizing a weighted sum of the local and global earth mover's distance (EMD) between vehicles' collected data distributions and the global data distribution. Based on this observation, we formulate the trajectory-dependent vehicle selection and data collection problem for FL in AD. Given that the problem is NP-hard, we develop an efficient algorithm that finds a solution with an approximation guarantee. Extensive simulation results demonstrate the effectiveness of our approach in improving object detection performance over existing benchmarks.
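The abstract reduces vehicle selection to minimizing a weighted sum of local and global EMD between collected and global data distributions. Below is a minimal sketch of such a selection score, assuming categorical class distributions and using the L1 distance as the discrete EMD, as is common in non-IID FL analyses; the weights alpha/beta and the uniform aggregation are hypothetical stand-ins for the coefficients the paper derives from its convergence bound.

```python
import numpy as np

def emd(p: np.ndarray, q: np.ndarray) -> float:
    """EMD between two discrete class distributions; for categorical
    distributions over the same label set this is commonly reduced to
    the L1 distance in non-IID FL analyses."""
    return float(np.abs(p - q).sum())

def selection_score(local_dists, global_dist, alpha=0.5, beta=0.5):
    """Weighted sum of local and global EMD terms for a candidate set of
    vehicles. Uniform aggregation weights and alpha/beta are assumptions,
    not the paper's derived coefficients."""
    local_term = np.mean([emd(p, global_dist) for p in local_dists])
    aggregate = np.mean(local_dists, axis=0)  # assumed uniform mixing
    global_term = emd(aggregate, global_dist)
    return alpha * local_term + beta * global_term

# Toy usage: three candidate vehicles over four object classes.
g = np.array([0.25, 0.25, 0.25, 0.25])
candidates = [np.array([0.7, 0.1, 0.1, 0.1]),
              np.array([0.1, 0.7, 0.1, 0.1]),
              np.array([0.25, 0.25, 0.3, 0.2])]
print(selection_score(candidates, g))  # lower is better
```

A selection algorithm would evaluate this score over feasible vehicle sets (here, trajectory-dependent), preferring sets whose individual and aggregate distributions both stay close to the global one.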
CP-Guard+: A New Paradigm for Malicious Agent Detection and Defense in Collaborative Perception
Hu, Senkang, Tao, Yihang, Fang, Zihan, Xu, Guowen, Deng, Yiqin, Kwong, Sam, Fang, Yuguang
Collaborative perception (CP) is a promising method for safe connected and autonomous driving, enabling multiple vehicles to share sensing information to enhance perception performance. However, compared with single-vehicle perception, the openness of a CP system makes it more vulnerable to malicious attacks that inject false information to mislead the perception of an ego vehicle, posing severe risks to safe driving. To mitigate this vulnerability, we first propose a new paradigm for malicious agent detection that identifies malicious agents at the feature level without requiring verification of final perception results, significantly reducing computational overhead. Building on this paradigm, we introduce CP-GuardBench, the first comprehensive dataset for training and evaluating malicious agent detection methods in CP systems. Furthermore, we develop a robust defense method called CP-Guard+, which enlarges the margin between the representations of benign and malicious features through a carefully designed Dual-Centered Contrastive Loss (DCCLoss). Finally, we conduct extensive experiments on both CP-GuardBench and V2X-Sim and demonstrate the superiority of CP-Guard+.
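As a rough illustration of the dual-centered idea, the sketch below pulls each feature toward the center of its own class (benign vs. malicious) and pushes it at least a margin away from the opposite center. The batch-wise center estimate, the margin value, and the squared-hinge form are assumptions, not the paper's exact DCCLoss formulation.

```python
import torch
import torch.nn.functional as F

def dual_centered_contrastive_loss(feats, labels, margin=1.0):
    """Sketch of a dual-centered contrastive objective. Assumes both
    classes (0 = benign, 1 = malicious) appear in the batch; centers are
    estimated per batch rather than maintained globally."""
    centers = torch.stack([feats[labels == c].mean(dim=0) for c in (0, 1)])
    own = centers[labels]        # center of each sample's own class
    other = centers[1 - labels]  # center of the opposite class
    pull = (feats - own).pow(2).sum(dim=1)                 # attract to own center
    push = F.relu(margin - (feats - other).norm(dim=1)).pow(2)  # repel other center
    return (pull + push).mean()

# Toy usage: 8 feature vectors, half benign, half malicious.
feats = torch.randn(8, 16)
labels = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
print(dual_centered_contrastive_loss(feats, labels))
```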
CP-Guard: Malicious Agent Detection and Defense in Collaborative Bird's Eye View Perception
Hu, Senkang, Tao, Yihang, Xu, Guowen, Deng, Yiqin, Chen, Xianhao, Fang, Yuguang, Kwong, Sam
Collaborative Perception (CP) has emerged as a promising technique for autonomous driving, where multiple connected and autonomous vehicles (CAVs) share their perception information to enhance overall perception performance and expand the perception range. However, in CP, an ego CAV must receive messages from its collaborators, which makes it vulnerable to attacks by malicious agents. For example, a malicious agent can send harmful information to the ego CAV to mislead it. To address this critical issue, we propose CP-Guard, a tailored defense mechanism for CP that can be deployed by each agent to accurately detect and eliminate malicious agents in its collaboration network. Our key idea is to enable CP to reach a consensus with, rather than a conflict against, the ego CAV's perception results. Based on this idea, we first develop a probability-agnostic sample consensus (PASAC) method to effectively sample a subset of collaborators and verify consensus without prior probabilities of malicious agents. Furthermore, we define a collaborative consistency loss (CCLoss) to capture the discrepancy between the ego CAV and its collaborators, which serves as the verification criterion for consensus. Finally, we conduct extensive experiments on collaborative bird's eye view (BEV) tasks, and our results demonstrate the effectiveness of CP-Guard.
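The PASAC idea, repeatedly sampling collaborator subsets and accepting those whose fused output stays consistent with the ego view, can be sketched as below. The fuse and cc_loss callables, the threshold tau, and the subset size are hypothetical stand-ins for the paper's BEV fusion pipeline and collaborative consistency loss.

```python
import random

def pasac(collaborators, fuse, cc_loss, ego_result, tau, subset_size=2, max_rounds=20):
    """Probability-agnostic sample consensus, sketched at a high level:
    sample a subset, fuse its messages, and accept the subset if its
    consistency loss against the ego result stays below tau. Agents never
    accepted into any consensual subset are treated as suspect."""
    benign = set()
    for _ in range(max_rounds):
        subset = random.sample(collaborators, k=min(subset_size, len(collaborators)))
        fused = fuse(subset)
        if cc_loss(fused, ego_result) < tau:  # consensus with the ego view
            benign.update(subset)
    return benign

# Toy usage: scalar "perception results"; the agent reporting 9.0 is malicious.
agents = [1.0, 1.1, 0.9, 9.0]
fuse = lambda s: sum(s) / len(s)
cc = lambda f, e: abs(f - e)
print(pasac(agents, fuse, cc, ego_result=1.0, tau=0.3))  # 9.0 never passes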
IC3M: In-Car Multimodal Multi-object Monitoring for Abnormal Status of Both Driver and Passengers
Fang, Zihan, Lin, Zheng, Hu, Senkang, Cao, Hangcheng, Deng, Yiqin, Chen, Xianhao, Fang, Yuguang
Recently, in-car monitoring has emerged as a promising technology for detecting early-stage abnormal status of the driver and providing timely alerts to prevent traffic accidents. Although training models with multimodal data enhances the reliability of abnormal status detection, the scarcity of labeled data and the imbalance of class distribution impede the extraction of critical abnormal-state features, significantly deteriorating training performance. Furthermore, modalities missing due to environmental and hardware limitations exacerbate the challenge of abnormal status identification. More importantly, monitoring abnormal health conditions of passengers, particularly in elderly care, is of paramount importance but remains underexplored. To address these challenges, we introduce IC3M, an efficient camera-rotation-based multimodal framework for monitoring both the driver and passengers in a car. IC3M comprises two key modules: an adaptive threshold pseudo-labeling strategy and a missing-modality reconstruction module. The former customizes pseudo-labeling thresholds for different classes based on the class distribution, generating class-balanced pseudo labels that guide model training effectively, while the latter leverages cross-modality relationships learned from limited labels to accurately recover missing modalities by transferring distributions from the available modalities. Extensive experimental results demonstrate that IC3M outperforms state-of-the-art benchmarks in accuracy, precision, and recall, while exhibiting superior robustness under limited labeled data and severely missing modalities.
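A minimal sketch of adaptive-threshold pseudo-labeling as described: per-class confidence thresholds scale with class frequency so that rare classes clear a lower bar, keeping the generated pseudo labels class-balanced. The linear scaling between floor and base is an assumption; the paper customizes thresholds from the class distribution in its own way.

```python
import numpy as np

def adaptive_thresholds(class_freq, base=0.95, floor=0.6):
    """Per-class thresholds: the head class keeps the high `base`
    threshold, tail classes are lowered toward `floor` (assumed linear)."""
    freq = np.asarray(class_freq, dtype=float)
    scaled = freq / freq.max()  # in (0, 1], 1 for the most frequent class
    return floor + (base - floor) * scaled

def pseudo_label(probs, thresholds):
    """Assign a pseudo label only when the top class clears its own
    threshold; return -1 for samples left unlabeled."""
    top = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    return np.where(conf >= thresholds[top], top, -1)

# Toy usage: head class 0 seen 900 times, tail class 2 only 30 times.
th = adaptive_thresholds([900, 300, 30])
probs = np.array([[0.97, 0.02, 0.01],   # confident head sample -> labeled 0
                  [0.10, 0.20, 0.70]])  # tail sample clears its lower bar -> 2
print(th, pseudo_label(probs, th))
```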
Channel-Aware Throughput Maximization for Cooperative Data Fusion in CAV
An, Haonan, Fang, Zhengru, Zhang, Yuang, Hu, Senkang, Chen, Xianhao, Xu, Guowen, Fang, Yuguang
Connected and autonomous vehicles (CAVs) have garnered significant attention due to their extended perception range and enhanced sensing coverage. To address challenges such as blind spots and obstructions, CAVs employ vehicle-to-vehicle (V2V) communications to aggregate sensory data from surrounding vehicles. However, cooperative perception is often constrained by the limitations of achievable network throughput and channel quality. In this paper, we propose a channel-aware throughput maximization approach to facilitate CAV data fusion, leveraging a self-supervised autoencoder for adaptive data compression. We formulate the problem as a mixed integer programming (MIP) model, which we decompose into two sub-problems to derive optimal data rate and compression ratio solutions under given link conditions. An autoencoder is then trained to minimize bitrate with the determined compression ratio, and a fine-tuning strategy is employed to further reduce spectrum resource consumption. Experimental evaluation on the OpenCOOD platform demonstrates the effectiveness of our proposed algorithm, showing a more than 20.19% improvement in network throughput and a 9.38% increase in average precision (AP@IoU) compared to state-of-the-art methods, with an optimal latency of 19.99 ms.
Index Terms: Cooperative perception, throughput optimization, connected and autonomous driving (CAV).
Recently, autonomous driving has emerged as a promising technology for smart cities. By leveraging communication and artificial intelligence (AI) technologies, autonomous driving can significantly enhance the performance of a city's transportation system. This improvement is achieved through real-time perception of road conditions and precise object detection by onboard sensors (such as radars, LiDARs, and cameras), thereby improving road safety without human intervention [1]. Moreover, the ability of autonomous vehicles to adapt to dynamic environments and communicate with surrounding infrastructure and vehicles is crucial for maintaining the timeliness and accuracy of collected data, thereby enhancing overall system performance [2]-[9]. Joint perception among connected and autonomous vehicles (CAVs) is a key enabler to overcome the limitations of individual agents' sensing capabilities [10].
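The decomposition described, finding feasible rate/compression pairs under link constraints and then picking the best, can be illustrated with a small exhaustive search over candidates. The toy utility function, candidate sets, and linear latency model below are assumptions standing in for the paper's two MIP sub-problems.

```python
import itertools

def best_rate_and_ratio(links, ratios, frame_bits, deadline_ms, utility):
    """Sketch of the decomposed search: keep only (rate, ratio) pairs that
    deliver the compressed frame within the latency deadline, then return
    the feasible pair with the highest perception utility."""
    best = None
    for rate_mbps, ratio in itertools.product(links, ratios):
        # rate in Mbps = 1e3 bits per ms; compressed size = frame_bits * ratio
        latency_ms = frame_bits * ratio / (rate_mbps * 1e3)
        if latency_ms > deadline_ms:
            continue
        u = utility(ratio)  # e.g., expected AP retained after compression
        if best is None or u > best[0]:
            best = (u, rate_mbps, ratio, latency_ms)
    return best

# Toy usage: 8 Mbit feature map, 20 ms budget, higher ratio = less compression.
print(best_rate_and_ratio(
    links=[10, 50, 100], ratios=[0.05, 0.1, 0.25],
    frame_bits=8e6, deadline_ms=20, utility=lambda r: r))
```

In the paper's pipeline, the autoencoder is then trained toward the selected compression ratio, so the search above would sit upstream of compression-model fine-tuning.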
AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning
Hu, Senkang, Fang, Zhengru, Fang, Zihan, Deng, Yiqin, Chen, Xianhao, Fang, Yuguang
Connected and autonomous driving has been developing rapidly in recent years. However, current autonomous driving systems, which are primarily based on data-driven approaches, exhibit deficiencies in interpretability, generalization, and continual learning. In addition, single-vehicle autonomous driving systems lack the ability to collaborate and negotiate with other vehicles, which is crucial for the safety and efficiency of autonomous driving. To address these issues, we leverage large language models (LLMs) to develop a novel framework, AgentsCoDriver, that enables multiple vehicles to drive collaboratively. AgentsCoDriver consists of five modules: an observation module, a reasoning engine, a cognitive memory module, a reinforcement reflection module, and a communication module. It accumulates knowledge, lessons, and experiences over time by continuously interacting with the environment, making it capable of lifelong learning. In addition, by leveraging the communication module, different agents can exchange information and realize negotiation and collaboration in complex traffic environments. Extensive experiments demonstrate the superiority of AgentsCoDriver.
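The five-module loop (observation, reasoning, cognitive memory, reinforcement reflection, communication) can be sketched as a thin agent class around a chat-style LLM call. Everything below is a structural illustration only; the prompt formats, memory representation, and the llm stub are hypothetical, not the framework's actual interfaces.

```python
class AgentsCoDriverSketch:
    """Structural sketch of the five-module loop from the abstract;
    `llm` stands in for any chat-completion callable."""

    def __init__(self, llm, memory=None):
        self.llm = llm
        self.memory = memory if memory is not None else []  # cognitive memory

    def observe(self, env_state):
        return f"Observation: {env_state}"

    def reason(self, observation, inbox):
        # Reasoning engine: condition the LLM on observation, messages, memory.
        prompt = f"{observation}\nInbox: {inbox}\nRecent memory: {self.memory[-3:]}"
        return self.llm(prompt)

    def reflect(self, decision, outcome):
        # Reinforcement reflection: store lessons to enable lifelong learning.
        self.memory.append((decision, outcome))

    def communicate(self, decision):
        return f"Intent: {decision}"  # broadcast to nearby agents

    def step(self, env_state, inbox):
        decision = self.reason(self.observe(env_state), inbox)
        return decision, self.communicate(decision)

# Toy usage with a stub LLM.
agent = AgentsCoDriverSketch(llm=lambda p: "slow down and yield")
decision, msg = agent.step("merging lane ahead", inbox=["car B: merging now"])
agent.reflect(decision, outcome="no conflict")
print(decision, msg, agent.memory)
```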
Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving
Hu, Senkang, Fang, Zhengru, Chen, Xianhao, Fang, Yuguang, Kwong, Sam
Collaborative perception has recently gained significant attention in autonomous driving, improving perception quality by enabling the exchange of additional information among vehicles. However, deploying collaborative perception systems can lead to domain shifts due to diverse environmental conditions and data heterogeneity among connected and autonomous vehicles (CAVs). To address these challenges, we propose a unified domain generalization framework applicable in both training and inference stages of collaborative perception. In the training phase, we introduce an Amplitude Augmentation (AmpAug) method to augment low-frequency image variations, broadening the model's ability to learn across various domains. We also employ a meta-consistency training scheme to simulate domain shifts, optimizing the model with a carefully designed consistency loss to encourage domain-invariant representations. In the inference phase, we introduce an intra-system domain alignment mechanism to reduce or potentially eliminate the domain discrepancy among CAVs prior to inference. Comprehensive experiments substantiate the effectiveness of our method in comparison with the existing state-of-the-art works. Code will be released at https://github.com/DG-CAVs/DG-CoPerception.git.
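Amplitude Augmentation is described as augmenting low-frequency image variations; a common FFT-based way to realize this, sketched below, swaps the centered low-frequency amplitude band with that of a reference image while preserving phase. The band fraction beta and the hard swap (rather than blending) are assumptions about the paper's exact recipe.

```python
import numpy as np

def amplitude_augment(img, ref, beta=0.1):
    """Replace the low-frequency amplitude of `img` with that of `ref`,
    keeping the phase; this varies illumination/style while preserving
    semantic structure. `beta` controls the swapped band size (assumed)."""
    f_img = np.fft.fftshift(np.fft.fft2(img, axes=(0, 1)), axes=(0, 1))
    f_ref = np.fft.fftshift(np.fft.fft2(ref, axes=(0, 1)), axes=(0, 1))
    amp, phase = np.abs(f_img), np.angle(f_img)
    h, w = img.shape[:2]
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    # Swap only the centered low-frequency band of the amplitude spectrum.
    amp[ch - bh:ch + bh, cw - bw:cw + bw] = \
        np.abs(f_ref)[ch - bh:ch + bh, cw - bw:cw + bw]
    mixed = amp * np.exp(1j * phase)
    out = np.fft.ifft2(np.fft.ifftshift(mixed, axes=(0, 1)), axes=(0, 1))
    return np.real(out)

# Toy usage on random "images" of shape (H, W, C).
img, ref = np.random.rand(64, 64, 3), np.random.rand(64, 64, 3)
print(amplitude_augment(img, ref).shape)  # (64, 64, 3)
```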
Adaptive Communications in Collaborative Perception with Domain Alignment for Autonomous Driving
Hu, Senkang, Fang, Zhengru, An, Haonan, Xu, Guowen, Zhou, Yuan, Chen, Xianhao, Fang, Yuguang
Collaborative perception among multiple connected and autonomous vehicles can greatly enhance perceptive capabilities by allowing vehicles to exchange supplementary information via communications. Despite advances in previous approaches, challenges remain due to channel variations and data heterogeneity among collaborating vehicles. To address these issues, we propose ACC-DA, a channel-aware collaborative perception framework that dynamically adjusts the communication graph and minimizes the average transmission delay while mitigating the side effects of data heterogeneity. Our novelty lies in three aspects. We first design a transmission delay minimization method, which constructs the communication graph and minimizes the transmission delay according to the current channel state information. We then propose an adaptive data reconstruction mechanism, which dynamically adjusts the rate-distortion trade-off to enhance perception efficiency and minimizes temporal redundancy during data transmission. Finally, we conceive a domain alignment scheme to align the data distributions from different vehicles, mitigating the domain gap between vehicles and improving the performance of the target task. Comprehensive experiments demonstrate the effectiveness of our method in comparison with existing state-of-the-art works.
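The transmission-delay-minimization step, rebuilding the communication graph from channel state information, can be illustrated with a greedy sketch that attaches the ego vehicle to the lowest-delay neighbors under a link budget. The rates, payload size, max_links cap, and greedy rule below are illustrative assumptions, not the paper's optimization procedure.

```python
def build_comm_graph(ego, neighbors, payload_bits, max_links=3):
    """Given per-link channel rates (bits/s), connect the ego vehicle to
    the neighbors that minimize average transmission delay, up to a link
    budget; the graph would be rebuilt as channel state information varies."""
    delays = {v: payload_bits / rate for v, rate in neighbors.items()}  # seconds
    chosen = sorted(delays, key=delays.get)[:max_links]
    avg_delay = sum(delays[v] for v in chosen) / len(chosen)
    return [(ego, v) for v in chosen], avg_delay

# Toy usage: rates in bits/s for three candidate collaborators.
edges, d = build_comm_graph("ego", {"A": 50e6, "B": 10e6, "C": 80e6},
                            payload_bits=4e6, max_links=2)
print(edges, f"{d * 1e3:.2f} ms")  # picks C and A, the fastest links
```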