
Collaborating Authors

 Li, Guopeng


Energy Score-based Pseudo-Label Filtering and Adaptive Loss for Imbalanced Semi-supervised SAR target recognition

arXiv.org Artificial Intelligence

Automatic target recognition (ATR) is an important use case for synthetic aperture radar (SAR) image interpretation. Recent years have seen significant advances in semi-supervised SAR ATR, yet existing semi-supervised SAR ATR algorithms show low recognition accuracy under class imbalance. This work proposes a class-imbalanced semi-supervised SAR target recognition approach based on dynamic energy scores and adaptive losses. First, an energy score-based method is developed that dynamically selects unlabeled samples close to the training distribution for pseudo-labelling during training, ensuring pseudo-label reliability under long-tailed distributions. Second, two loss functions suited to class imbalance are proposed: an adaptive margin perception loss and an adaptive hard triplet loss. The former offsets inter-class confusion in the classifier, alleviating the imbalance inherent in pseudo-label generation; the latter counters the model's preference for majority classes by focusing on hard samples during training. Experimental results on extremely imbalanced SAR datasets demonstrate that the proposed method performs well under the dual constraints of scarce labels and data imbalance, overcoming the model bias caused by imbalance and achieving high-precision target recognition.
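
As a rough illustration of the energy-score filtering idea, the sketch below (PyTorch; the quantile threshold and temperature are assumptions, not the paper's exact settings) keeps only unlabeled samples whose energy falls within the bulk of the labeled-data energy distribution before assigning pseudo-labels.

import torch

def energy_score(logits: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    # E(x) = -T * logsumexp(f(x) / T); lower energy ~ closer to the training distribution
    return -T * torch.logsumexp(logits / T, dim=1)

@torch.no_grad()
def select_pseudo_labels(model, labeled_x, unlabeled_x, quantile=0.8, T=1.0):
    """Keep unlabeled samples whose energy lies inside the bulk of the
    labeled-data energy distribution, then pseudo-label them (illustrative rule)."""
    model.eval()
    e_labeled = energy_score(model(labeled_x), T)
    # dynamic threshold taken from the labeled (training) energy distribution
    thresh = torch.quantile(e_labeled, quantile)

    logits_u = model(unlabeled_x)
    e_unlabeled = energy_score(logits_u, T)
    mask = e_unlabeled <= thresh               # near the training distribution
    pseudo_labels = logits_u.argmax(dim=1)
    return unlabeled_x[mask], pseudo_labels[mask], mask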


TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant

arXiv.org Artificial Intelligence

Most knowledge distillation (KD) methodologies predominantly focus on teacher-student pairs with similar architectures, such as both being convolutional neural networks (CNNs). However, the potential and flexibility of KD can be greatly improved by expanding it to Cross-Architecture KD (CAKD), where the knowledge of homogeneous and heterogeneous teachers can be transferred flexibly to a given student. The primary challenge in CAKD lies in the substantial feature gaps between heterogeneous models, originating from the distinction of their inherent inductive biases and module functions. To this end, we introduce an assistant model as a bridge to facilitate smooth feature knowledge transfer between heterogeneous teachers and students. More importantly, within our proposed design principle, the assistant model combines the advantages of cross-architecture inductive biases and module functions by merging convolution and attention modules derived from both the student and the teacher. Furthermore, we observe that heterogeneous features exhibit diverse spatial distributions in CAKD, hindering the effectiveness of conventional pixel-wise mean squared error (MSE) loss. Therefore, we leverage a spatial-agnostic InfoNCE loss to align features after spatial smoothing, thereby improving feature alignment in CAKD. Our method is evaluated across homogeneous model pairs and arbitrary heterogeneous combinations of CNNs, ViTs, and MLPs, achieving state-of-the-art performance for distilled models with a maximum gain of 11.47% on CIFAR-100 and 3.67% on ImageNet-1K. Our code and models will be released. Knowledge Distillation (KD) (Hinton et al., 2015; Romero et al., 2015) has been demonstrated to be a powerful method for transferring knowledge from a pre-trained, cumbersome teacher model to a compact and efficient student model. Compared to a model trained from scratch, a student model distilled by appropriate teachers usually improves significantly.
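
A minimal sketch of what a spatial-agnostic InfoNCE alignment could look like (PyTorch; average pooling stands in for the spatial smoothing, and the in-batch contrastive setup with a shared channel dimension is an assumption rather than the paper's exact formulation):

import torch
import torch.nn.functional as F

def infonce_align(f_student: torch.Tensor, f_teacher: torch.Tensor, tau: float = 0.1):
    """f_student, f_teacher: (B, C, H, W) feature maps from heterogeneous models,
    assumed already projected to a common channel dimension C."""
    # spatial smoothing: collapse the spatial layout so the loss is agnostic
    # to where activations sit in the feature map
    zs = F.adaptive_avg_pool2d(f_student, 1).flatten(1)   # (B, C)
    zt = F.adaptive_avg_pool2d(f_teacher, 1).flatten(1)   # (B, C)
    zs = F.normalize(zs, dim=1)
    zt = F.normalize(zt, dim=1)

    logits = zs @ zt.t() / tau                            # (B, B) similarity matrix
    targets = torch.arange(zs.size(0), device=zs.device)  # matching pairs on the diagonal
    return F.cross_entropy(logits, targets)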


A Conflict Resolution Dataset Derived from Argoverse-2: Analysis of the Safety and Efficiency Impacts of Autonomous Vehicles at Intersections

arXiv.org Artificial Intelligence

As the deployment of autonomous vehicles (AVs) in mixed traffic flow becomes increasingly prevalent, ensuring safe and smooth interactions between AVs and human agents is of critical importance. How road users resolve conflicts at intersections has significant impacts on driving safety and traffic efficiency. These impacts depend on both the behaviours of AVs and humans' reactions to the presence of AVs. Therefore, using real-world data to assess and compare the safety and efficiency measures of AV-involved and AV-free scenarios is crucial. To this end, this paper presents a high-quality conflict resolution dataset derived from the open Argoverse-2 motion forecasting data to analyse the safety and efficiency impacts of AVs. The contribution is twofold: First, we propose and apply a specific data processing pipeline to select scenarios of interest, rectify data errors, and enhance the raw data in Argoverse-2. As a result, 5000+ cases where an AV resolves a conflict with a human road user and 16000+ conflict resolution cases without AVs are obtained. The motion data in these cases are smooth and consistent, and the open dataset comprises diverse and balanced conflict resolution regimes. Second, this paper employs surrogate safety measures and a novel efficiency measure to assess the impact of AVs at intersections. The results suggest that human drivers exhibit similar safety and efficiency performance when interacting with AVs as with other human drivers. In contrast, pedestrians demonstrate more diverse reactions. Furthermore, due to the safety-first strategy of AVs, the average efficiency of AV-involved conflict resolution decreases by 8.6% compared to AV-free cases. This informative dataset provides a valuable resource for researchers, and the findings give insights into the possible impacts of AVs. The dataset is openly available via https://github.com/RomainLITUD/conflict_resolution_dataset.
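
The abstract does not name the specific surrogate safety measures used, so the following is purely an illustration of the concept: computing post-encroachment time (PET), one widely used surrogate measure, from two trajectories around an assumed conflict point (NumPy; the conflict-point estimate and area radius are placeholders).

import numpy as np

def post_encroachment_time(t_a, xy_a, t_b, xy_b, conflict_xy, radius=1.0):
    """PET: time gap between the first road user leaving a circular conflict
    area and the second one entering it. t_*: (N,) timestamps in seconds;
    xy_*: (N, 2) positions in metres; conflict_xy: (2,) conflict-point estimate."""
    in_a = np.linalg.norm(xy_a - conflict_xy, axis=1) <= radius
    in_b = np.linalg.norm(xy_b - conflict_xy, axis=1) <= radius
    if not in_a.any() or not in_b.any():
        return None                          # one of the users never reaches the area
    first_exit = min(t_a[in_a].max(), t_b[in_b].max())
    second_entry = max(t_a[in_a].min(), t_b[in_b].min())
    return second_entry - first_exit         # negative value => simultaneous occupancy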


Large Car-following Data Based on Lyft level-5 Open Dataset: Following Autonomous Vehicles vs. Human-driven Vehicles

arXiv.org Artificial Intelligence

Car-Following (CF), as a fundamental driving behaviour, has significant influences on the safety and efficiency of traffic flow. Investigating how human drivers react differently when following autonomous vehicles (AVs) vs. human-driven vehicles (HVs) is thus critical for mixed traffic flow. Research in this field can be expedited with trajectory datasets collected by AVs. However, trajectories collected by AVs are noisy and not readily applicable for studying CF behaviour. This paper extracts and enhances two categories of CF data, HV-following-AV (H-A) and HV-following-HV (H-H), from the open Lyft level-5 dataset. First, CF pairs are selected based on specific rules. Next, the quality of the raw data is assessed by anomaly analysis. Then, the raw CF data is corrected and enhanced via motion planning, Kalman filtering, and wavelet denoising. As a result, 29k+ H-A and 42k+ H-H car-following segments are obtained, with a total driving distance of 150k+ km. A diversity assessment shows that the processed data cover complete CF regimes for calibrating CF models. This open and ready-to-use dataset provides the opportunity to investigate CF behaviour when following AVs vs. HVs from real-world data. It can further facilitate studies exploring the impact of AVs on mixed urban traffic.
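
As a hedged illustration of one of the enhancement steps mentioned above, the sketch below applies a constant-velocity Kalman filter to noisy longitudinal positions (NumPy; the noise parameters are placeholders, and the dataset's actual pipeline additionally uses motion planning and wavelet denoising).

import numpy as np

def kalman_smooth_position(z, dt=0.1, q=0.5, r=1.0):
    """z: (N,) noisy longitudinal positions [m]; returns filtered position and speed."""
    F_ = np.array([[1.0, dt], [0.0, 1.0]])        # constant-velocity dynamics
    H = np.array([[1.0, 0.0]])                    # only position is observed
    Q = q * np.array([[dt**4 / 4, dt**3 / 2],     # process noise
                      [dt**3 / 2, dt**2]])
    R = np.array([[r]])                           # measurement noise

    x = np.array([z[0], 0.0])                     # initial state: [position, speed]
    P = np.eye(2)
    out = []
    for zk in z:
        # predict
        x = F_ @ x
        P = F_ @ P @ F_.T + Q
        # update with the new position measurement
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.array([zk]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        out.append(x.copy())
    return np.array(out)                          # (N, 2): smoothed position and speed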


Distil the informative essence of loop detector data set: Is network-level traffic forecasting hungry for more data?

arXiv.org Artificial Intelligence

Network-level traffic condition forecasting has been intensively studied for decades. Although prediction accuracy has been continuously improved with emerging deep learning models and ever-expanding traffic data, traffic forecasting still faces many challenges in practice. These challenges include the robustness of data-driven models, the inherent unpredictability of traffic dynamics, and whether further improvement of traffic forecasting requires more sensor data. In this paper, we focus on the latter question, particularly for data from loop detectors. To answer it, we propose an uncertainty-aware traffic forecasting framework to explore how many samples of loop data are truly effective for training forecasting models. Firstly, the model design combines traffic flow theory with graph neural networks, ensuring the robustness of prediction and uncertainty quantification. Secondly, evidential learning is employed to quantify different sources of uncertainty in a single pass. The estimated uncertainty is used to "distil" the essence of the dataset that sufficiently covers its information content. Results from a case study of a highway network around Amsterdam show that, from 2018 to 2021, more than 80% of the daytime data can be removed; the remaining 20% of samples have equal predictive power for training models. This result suggests that large traffic datasets can indeed be subdivided into significantly smaller but equally informative datasets. From these findings, we conclude that the proposed methodology proves valuable in evaluating the true information content of large traffic datasets. Further extensions, such as extracting smaller, spatially non-redundant datasets, are possible with this method.
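
To make the uncertainty-based "distillation" idea concrete, here is a hedged sketch assuming a deep-evidential-regression head that outputs Normal-Inverse-Gamma parameters (gamma, nu, alpha, beta); the selection rule shown, keeping the most epistemically uncertain samples, is purely illustrative and may differ from the paper's criterion (PyTorch).

import torch

def evidential_uncertainties(gamma, nu, alpha, beta):
    """Per-sample uncertainties from NIG parameters (all tensors of shape (N,))."""
    aleatoric = beta / (alpha - 1.0)           # expected data noise
    epistemic = beta / (nu * (alpha - 1.0))    # model (knowledge) uncertainty
    return aleatoric, epistemic

def distil_dataset(indices, gamma, nu, alpha, beta, keep_ratio=0.2):
    """Keep the keep_ratio fraction of samples carrying the most epistemic
    uncertainty, i.e. the samples the model can still learn from (assumed rule)."""
    _, epistemic = evidential_uncertainties(gamma, nu, alpha, beta)
    k = max(1, int(keep_ratio * len(indices)))
    top = torch.topk(epistemic, k).indices
    return [indices[i] for i in top.tolist()]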


Towards Generic and Controllable Attacks Against Object Detection

arXiv.org Artificial Intelligence

Existing adversarial attacks against Object Detectors (ODs) suffer from two inherent limitations. Firstly, ODs have complicated meta-structure designs; hence, most advanced attacks against ODs concentrate on specific detector-intrinsic structures, which makes it hard for them to work on other detectors and motivates us to design a generic attack against ODs. Secondly, most works against ODs craft Adversarial Examples (AEs) by generalizing image-level attacks from classification to detection, which introduces redundant computation and perturbations in semantically meaningless areas (e.g., backgrounds), creating an urgent need for controllable attacks on ODs. To this end, we propose a generic white-box attack, LGP (local perturbations with adaptively global attacks), to blind mainstream object detectors with controllable perturbations. For a detector-agnostic attack, LGP tracks high-quality proposals and optimizes three heterogeneous losses simultaneously. In this way, we can fool the crucial components of ODs using only a part of their outputs, without relying on specific structures. Regarding controllability, we establish an object-wise constraint that adaptively exploits foreground-background separation to confine perturbations to foreground regions. Experimentally, the proposed LGP successfully attacked sixteen state-of-the-art object detectors on the MS-COCO and DOTA datasets, achieving promising imperceptibility and transferability. Code is publicly available at https://github.com/liguopeng0923/LGP.git
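
As a toy illustration of the object-wise constraint idea only (not LGP's actual three-loss optimization or its adaptive foreground-background separation), the sketch below confines a single sign-gradient perturbation step to foreground regions derived from predicted boxes (PyTorch; all parameter values are placeholders).

import torch

def foreground_mask(image_hw, boxes):
    """boxes: (M, 4) tensor of (x1, y1, x2, y2) proposals; returns a (1, H, W) binary mask."""
    h, w = image_hw
    mask = torch.zeros(1, h, w)
    for x1, y1, x2, y2 in boxes.round().long().tolist():
        mask[:, max(y1, 0):min(y2, h), max(x1, 0):min(x2, w)] = 1.0
    return mask

def masked_attack_step(image, image_orig, grad, mask, step_size=2.0 / 255, eps=8.0 / 255):
    """One sign-gradient step whose perturbation is confined to foreground regions."""
    delta = (image + step_size * grad.sign() * mask) - image_orig
    delta = delta.clamp(-eps, eps) * mask            # respect the budget, leave backgrounds untouched
    return (image_orig + delta).clamp(0.0, 1.0)      # keep pixels in a valid range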