Pang, Junbiao
Uncertainty-aware Long-tailed Weights Model the Utility of Pseudo-labels for Semi-supervised Learning
Wu, Jiaqi, Pang, Junbiao, Huang, Qingming
Current Semi-supervised Learning (SSL) methods adopt the pseudo-labeling strategy and filter pseudo-labels with confidence thresholds. However, this mechanism has notable drawbacks: 1) setting a reasonable threshold is an open problem that significantly affects which high-quality pseudo-labels are selected; and 2) deep models often exhibit over-confidence, especially when labeled data are scarce, which makes the confidence value an unreliable indicator of pseudo-label quality. In this paper, we propose an Uncertainty-aware Ensemble Structure (UES) to assess the utility of pseudo-labels for unlabeled samples. We further model the utility of pseudo-labels as long-tailed weights, which avoids the open problem of threshold setting. Concretely, the long-tailed weights ensure that even unreliable pseudo-labels still contribute to enhancing the model's robustness. Moreover, UES is lightweight and architecture-agnostic, and easily extends to various computer vision tasks, including classification and regression. Experimental results demonstrate that combining the proposed method with DualPose yields a 3.47% improvement in Percentage of Correct Keypoints (PCK) on the Sniffing dataset with 100 data points (30 labeled), a 7.29% improvement in PCK on the FLIC dataset with 100 data points (50 labeled), and a 3.91% improvement in PCK on the LSP dataset with 200 data points (100 labeled). Furthermore, when combined with FixMatch, the proposed method achieves a 0.2% accuracy improvement on CIFAR-10 with 40 labeled data points and a 0.26% accuracy improvement on CIFAR-100 with 400 labeled data points.
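As a rough illustration of the weighting idea (not the authors' released code), the sketch below replaces a hard confidence threshold with per-sample weights derived from the disagreement of a small ensemble; the specific weighting function and ensemble size are assumptions.

```python
# Rough sketch (not the authors' released code): replace a hard confidence
# threshold with per-sample utility weights derived from ensemble disagreement.
# The weighting function below is a hypothetical stand-in for the long-tailed weights.
import torch
import torch.nn.functional as F

def weighted_pseudo_label_loss(logits_student, logits_ensemble):
    """logits_ensemble: (K, B, C) predictions from K lightweight ensemble heads."""
    probs = logits_ensemble.softmax(dim=-1)                 # (K, B, C)
    mean_probs = probs.mean(dim=0)                          # (B, C)
    pseudo_labels = mean_probs.argmax(dim=-1)               # (B,)
    # Uncertainty: ensemble variance on the chosen class.
    var = probs.var(dim=0).gather(1, pseudo_labels[:, None]).squeeze(1)
    weights = 1.0 / (1.0 + 10.0 * var)                      # hypothetical monotone weighting
    per_sample = F.cross_entropy(logits_student, pseudo_labels, reduction="none")
    return (weights * per_sample).mean()

# Example: 4 ensemble heads, batch of 8, 10 classes.
loss = weighted_pseudo_label_loss(torch.randn(8, 10), torch.randn(4, 8, 10))
print(loss.item())
```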
Unsupervised Abnormal Stop Detection for Long Distance Coaches with Low-Frequency GPS
Deng, Jiaxin, Pang, Junbiao, Xu, Jiayu, Yu, Haitao
Long-distance coaches provide a convenient and economical mode of public transportation in urban life. One notable problem is detecting abnormal stops of coaches, which often indicate illegal pick-ups along the route and may endanger passenger safety. Detecting such abnormal stops from low-quality GPS data has become a pressing issue. In this paper, we propose an unsupervised method that helps transportation managers efficiently perform Abnormal Stop Detection (ASD) for long-distance coaches. Concretely, our method casts the ASD problem as an unsupervised clustering framework in which both normal and abnormal stops are decomposed. First, we propose a stop duration model for low-frequency GPS based on the assumption that a coach changes speed approximately linearly. Second, we separate abnormal stops from normal stop points under a low-rank assumption. The proposed method is conceptually simple yet efficient: by leveraging the low-rank assumption to model normal stop points, our approach enables domain experts to discover abnormal stops of coaches, as demonstrated in a case study motivated by traffic managers. The dataset and code are publicly available at: https://github.com/pangjunbiao/IPPs.
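To make the low-rank idea concrete, the following generic robust-PCA-style sketch separates a trip-by-stop duration matrix into a low-rank part (regular, normal stops) and a sparse part (candidate abnormal stops); the matrix layout and penalties are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch only: split a stop-feature matrix into a low-rank part
# (shared, normal stop pattern) and a sparse part (candidate abnormal stops).
# Generic robust-PCA-style alternating proximal steps; M, lam, and mu are assumptions.
import numpy as np

def svd_shrink(X, tau):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def soft_thresh(X, tau):
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0)

def decompose_stops(M, lam=None, mu=None, n_iter=200):
    """M: rows = coach trips, columns = stop-duration features along the route."""
    lam = lam or 1.0 / np.sqrt(max(M.shape))
    mu = mu or 0.25 * np.abs(M).mean()
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = svd_shrink(M - S, mu)          # low-rank: regular stop behaviour
        S = soft_thresh(M - L, lam * mu)   # sparse: candidate abnormal stops
    return L, S

M = np.random.rand(50, 20)
M[7, 3] += 5.0                             # an unusually long stop
L, S = decompose_stops(M)
print(np.unravel_index(np.abs(S).argmax(), S.shape))  # likely flags (7, 3)
```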
Bilateral Sharpness-Aware Minimization for Flatter Minima
Deng, Jiaxin, Pang, Junbiao, Zhang, Baochang, Huang, Qingming
Sharpness-Aware Minimization (SAM) enhances generalization by reducing Max-Sharpness (MaxS). Despite its practical success, we empirically found that the MaxS behind SAM's generalization gains faces a "Flatness Indicator Problem" (FIP): because SAM is in essence a greedy search, it only considers flatness in the direction of gradient ascent, so the region reached by the next minimization step is not sufficiently flat. A better Flatness Indicator (FI) would yield better generalization of neural networks. In this paper, we propose to utilize the difference between the training loss and the minimum loss over the neighborhood surrounding the current weights, which we denote as Min-Sharpness (MinS). By merging MaxS and MinS, we obtain a better FI that indicates a flatter direction during optimization. Specifically, we combine this FI with SAM into the proposed Bilateral SAM (BSAM), which finds flatter minima than SAM. Our theoretical analysis proves that BSAM converges to local minima. Extensive experiments demonstrate that BSAM offers superior generalization performance and robustness compared to vanilla SAM across various tasks, i.e., classification, transfer learning, human pose estimation, and network quantization. Code is publicly available at: https://github.com/ajiaaa/BSAM.
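The sketch below illustrates a bilateral, SAM-style update that probes both an ascent and a descent perturbation and mixes the two gradients; it is an illustration of the idea rather than the official BSAM implementation, and rho, alpha, and the mixing rule are assumptions.

```python
# Hedged sketch of a bilateral, SAM-style update: besides SAM's gradient at an
# ascent perturbation (MaxS side), it also probes a descent perturbation (MinS side)
# and mixes the two gradients. Not the official BSAM algorithm; rho, alpha, and the
# mixing rule are assumptions.
import torch

def bilateral_sam_step(model, loss_fn, x, y, base_opt, rho=0.05, alpha=0.5):
    # 1) Gradient at the current weights, used to build the perturbations.
    loss = loss_fn(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)) + 1e-12
    # 2) Evaluate gradients at +rho (ascent) and -rho (descent) perturbations.
    probed = []
    for sign in (+1.0, -1.0):
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p.add_(sign * rho * g / norm)
        g_pert = torch.autograd.grad(loss_fn(model(x), y), list(model.parameters()))
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p.sub_(sign * rho * g / norm)   # restore the original weights
        probed.append(g_pert)
    # 3) Mix the two probed gradients and take the base optimizer step.
    base_opt.zero_grad()
    for p, ga, gd in zip(model.parameters(), probed[0], probed[1]):
        p.grad = alpha * ga + (1 - alpha) * gd
    base_opt.step()

model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
bilateral_sam_step(model, torch.nn.functional.cross_entropy,
                   torch.randn(8, 10), torch.randint(0, 2, (8,)), opt)
```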
Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization
Deng, Jiaxin, Pang, Junbiao, Zhang, Baochang
Sharpness-Aware Minimization (SAM) has emerged as a promising approach for effectively reducing generalization error. However, SAM incurs twice the computational cost of the base optimizer (e.g., SGD). We propose Asymptotic Unbiased Sampling with respect to iterations to accelerate SAM (AUSAM), which maintains the model's generalization capacity while significantly improving computational efficiency. Concretely, we probabilistically sample a subset of data points beneficial to SAM optimization based on a theoretically guaranteed criterion, i.e., the Gradient Norm of each Sample (GNS). We further approximate the GNS by the difference in loss values before and after the perturbation in SAM. As a plug-and-play, architecture-agnostic method, our approach consistently accelerates SAM across a range of tasks and networks, i.e., classification, human pose estimation, and network quantization. On CIFAR-10/100 and Tiny-ImageNet, AUSAM achieves results comparable to SAM while providing a speedup of over 70%. Compared to recent dynamic data pruning methods, AUSAM is better suited to SAM and excels at maintaining performance. Additionally, AUSAM accelerates optimization in human pose estimation and model quantization without sacrificing performance, demonstrating its broad practicality.
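A minimal sketch of the sampling idea, assuming per-sample losses before and after the SAM perturbation are available: samples are scored by the loss change (a proxy for their gradient norm) and a probabilistic subset is kept for the SAM update. The keep ratio and temperature are placeholders, not the released AUSAM settings.

```python
# Illustrative sketch: score each sample by the change in its loss before vs. after
# the SAM perturbation (a proxy for its gradient norm), then keep a probabilistic
# subset for the SAM update. Not the released AUSAM code; keep_ratio and temp are
# assumptions.
import torch

def select_subset(loss_before, loss_after, keep_ratio=0.5, temp=1.0):
    score = (loss_after - loss_before).abs()        # per-sample GNS proxy
    probs = torch.softmax(score / temp, dim=0)
    k = max(1, int(keep_ratio * score.numel()))
    return torch.multinomial(probs, k, replacement=False)

loss_before = torch.rand(128)
loss_after = loss_before + 0.1 * torch.rand(128)
subset = select_subset(loss_before, loss_after)
print(subset.shape)   # roughly half the batch is kept for the SAM step
```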
Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images
Pang, Junbiao, Dong, Zailin, Deng, Jiaxin, Zhu, Mengyuan, Zhang, Yunwei
Parsing Computer-Aided Design (CAD) drawings is a fundamental step for CAD revision, semantic-based management, and the generation of 3D prototypes in both the architecture and engineering industries. Labeling symbols in a CAD drawing is a challenging and notoriously tedious task in practice. In this work, we propose to label and spot symbols from CAD images that are converted from CAD drawings. The advantage of spotting symbols from CAD images lies in the low expertise required of labelers and the low annotation cost. However, pixel-wise symbol spotting from CAD images remains challenging. We propose a pixel-wise point location method based on Progressive Gaussian Kernels (PGK) to balance training efficiency and location accuracy. In addition, we introduce a local offset into the heatmap-based point location method. Based on the detected keypoints, we propose a symbol grouping method to redraw rectangular symbols in CAD images. We have released a dataset containing CAD images of equipment rooms from telecommunication industrial CAD drawings. Extensive experiments on this real-world dataset show that the proposed method has good generalization ability.
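The following sketch illustrates one plausible reading of the progressive kernel idea: the supervising Gaussian shrinks from coarse to fine over training, and a sub-pixel offset target is kept alongside the heatmap. The schedule and offset encoding are assumptions for illustration only.

```python
# Hedged sketch of the "progressive" idea: supervise keypoint heatmaps with a
# Gaussian whose standard deviation shrinks as training progresses (coarse to fine),
# plus a local sub-pixel offset target. The schedule and offset encoding are
# assumptions, not the paper's exact formulation.
import numpy as np

def progressive_heatmap(point, shape, epoch, total_epochs, sigma_max=8.0, sigma_min=1.5):
    """point: (x, y) in pixels; shape: (H, W). Returns a heatmap and a (dx, dy) offset."""
    sigma = sigma_max - (sigma_max - sigma_min) * (epoch / max(1, total_epochs - 1))
    H, W = shape
    ys, xs = np.mgrid[0:H, 0:W]
    heatmap = np.exp(-((xs - point[0]) ** 2 + (ys - point[1]) ** 2) / (2 * sigma ** 2))
    # Local offset: fractional part of the ground-truth point relative to its pixel.
    offset = (point[0] - np.floor(point[0]), point[1] - np.floor(point[1]))
    return heatmap.astype(np.float32), offset

hm_early, _ = progressive_heatmap((40.3, 25.7), (64, 64), epoch=0, total_epochs=50)
hm_late, _ = progressive_heatmap((40.3, 25.7), (64, 64), epoch=49, total_epochs=50)
print(hm_early.sum() > hm_late.sum())   # early kernels are broader
```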
A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised Classification
Wu, Jiaqi, Pang, Junbiao, Zhang, Baochang, Huang, Qingming
Semi-supervised learning (SSL) is a practical challenge in computer vision. Pseudo-label (PL) methods, e.g., FixMatch and FreeMatch, achieve State-Of-The-Art (SOTA) performance in SSL. These approaches employ a threshold-to-pseudo-label (T2L) process that generates PLs by thresholding the confidence scores predicted for unlabeled data during self-training. However, self-trained models typically yield biased and high-variance predictions, especially when only a small amount of labeled data is available. To address this issue, we propose a lightweight channel-based ensemble method that effectively consolidates multiple inferior PLs into a theoretically guaranteed unbiased and low-variance one. Importantly, our approach can be readily extended to any SSL framework, such as FixMatch or FreeMatch. Experimental results demonstrate that our method significantly outperforms state-of-the-art techniques on CIFAR-10/100 in terms of both effectiveness and efficiency.
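As an illustration of a channel-based ensemble head (layer sizes and group count are assumptions, not the paper's settings), the backbone's feature channels are split into groups, each group predicts with its own lightweight classifier, and the averaged prediction yields a lower-variance pseudo-label:

```python
# Illustrative sketch of a channel-based ensemble head: the backbone's feature
# channels are split into groups, each group gets its own lightweight classifier,
# and the group predictions are averaged into a single lower-variance pseudo-label.
# Layer sizes and the number of groups are assumptions.
import torch
import torch.nn as nn

class ChannelEnsembleHead(nn.Module):
    def __init__(self, in_channels=128, num_classes=10, groups=4):
        super().__init__()
        assert in_channels % groups == 0
        self.groups = groups
        self.heads = nn.ModuleList(
            nn.Linear(in_channels // groups, num_classes) for _ in range(groups)
        )

    def forward(self, features):
        chunks = features.chunk(self.groups, dim=1)                       # split channels
        logits = torch.stack([h(c) for h, c in zip(self.heads, chunks)])  # (G, B, C)
        probs = logits.softmax(dim=-1).mean(dim=0)                        # averaged prediction
        return probs.argmax(dim=-1), probs                                # pseudo-labels, confidence

head = ChannelEnsembleHead()
pl, conf = head(torch.randn(16, 128))
print(pl.shape, conf.shape)   # torch.Size([16]) torch.Size([16, 10])
```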
Effective Gradient Sample Size via Variation Estimation for Accelerating Sharpness-Aware Minimization
Deng, Jiaxin, Pang, Junbiao, Zhang, Baochang, Wang, Tian
In recent years, some research has been proposed to understand the generalization of DNNs [Keskar et al., 2017; Zhang et al., 2021; Mulayoff and Michaeli, 2020; Andriushchenko and Flammarion, 2022; Zhou et al., 2021; Zhou et al., 2022]. Several studies have verified the relationship between flat minima and generalization error [Dinh et al., 2017; Li et al., 2018; Jiang et al., 2020; Liu et al., 2020; Sun et al., 2021]. Among these studies, Jiang et al. [Jiang et al., 2020] explored over 40 complexity measures and demonstrated that a sharpness-based measure exhibits the highest correlation with generalization. However, SAM requires two forward and backward operations in one iteration, which results in SAM's optimization speed being only half that of SGD. In some scenarios, dedicating twice the training time to achieve only a marginal improvement in accuracy may not strike an optimal balance between accuracy and efficiency. For example, when SAM is employed to optimize WideResNet-28-10 on CIFAR-100, despite achieving a higher test accuracy than SGD (84.45% vs. 82.89%), the optimization speed is only half that of SGD (343 imgs/s vs. 661 imgs/s), as illustrated in Figure 1.
Modeling the Uncertainty with Maximum Discrepant Students for Semi-supervised 2D Pose Estimation
Wu, Jiaqi, Pang, Junbiao, Huang, Qingming
Semi-supervised pose estimation is a practically challenging task in computer vision. Although numerous excellent semi-supervised classification methods have emerged, these methods typically use confidence to evaluate the quality of pseudo-labels, which is difficult to apply in pose estimation tasks. For example, in pose estimation, confidence represents only the probability that a heatmap position is a keypoint, not the quality of that prediction. In this paper, we propose a simple yet efficient framework that estimates the quality of pseudo-labels in semi-supervised pose estimation by modeling the uncertainty of the pseudo-labels. Concretely, under the dual mean-teacher framework, we construct two maximum discrepant students (MDSs) to effectively push the two teachers to generate different decision boundaries for the same sample. Moreover, we derive multiple uncertainties to assess the quality of the pseudo-labels. Experimental results demonstrate that our method improves the performance of semi-supervised pose estimation on three datasets.
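A hedged illustration of the idea: two student heads are explicitly pushed apart on unlabeled data, and the disagreement between the corresponding teacher predictions serves as an uncertainty score for weighting pseudo-labels. The loss weights and distance measure below are assumptions, not the paper's exact formulation.

```python
# Hedged illustration of the "maximum discrepant students" idea: two student heads
# are encouraged to disagree on unlabeled data, and the disagreement between the
# corresponding teacher predictions is used as an uncertainty score for weighting
# pseudo-labels. The distance measure and weighting are assumptions.
import torch
import torch.nn.functional as F

def discrepancy_loss(heatmap_s1, heatmap_s2):
    # Students are pushed apart (the negative distance is minimized).
    return -F.mse_loss(heatmap_s1, heatmap_s2)

def pseudo_label_uncertainty(heatmap_t1, heatmap_t2):
    # Per-sample disagreement between the two teachers: large value = uncertain.
    return (heatmap_t1 - heatmap_t2).pow(2).flatten(1).mean(dim=1)

s1, s2 = torch.rand(4, 17, 64, 64), torch.rand(4, 17, 64, 64)   # 17 keypoint heatmaps
t1, t2 = torch.rand(4, 17, 64, 64), torch.rand(4, 17, 64, 64)
u = pseudo_label_uncertainty(t1, t2)
weights = torch.exp(-u)          # hypothetical down-weighting of uncertain samples
print(discrepancy_loss(s1, s2).item(), weights.shape)
```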