Zheng, Kang
Scalable Semi-supervised Landmark Localization for X-ray Images using Few-shot Deep Adaptive Graph
Zhou, Xiao-Yun, Lai, Bolin, Li, Weijian, Wang, Yirui, Zheng, Kang, Wang, Fakai, Lin, Chihung, Lu, Le, Huang, Lingyun, Han, Mei, Xie, Guotong, Xiao, Jing, Kuo, Chang-Fu, Harrison, Adam, Miao, Shun
Landmark localization plays an important role in medical image analysis. Learning-based methods, including CNNs and GCNs, have demonstrated state-of-the-art performance. However, most of these methods are fully supervised and rely heavily on manual labeling of a large training dataset. In this paper, building on a fully supervised graph-based method, DAG, we propose a semi-supervised extension of it, termed few-shot DAG, i.e., five-shot DAG. It first trains a DAG model on the labeled data and then fine-tunes the pre-trained model on the unlabeled data with a teacher-student semi-supervised learning (SSL) mechanism. In addition to the semi-supervised loss, we propose another loss that uses the Jensen-Shannon (JS) divergence to regularize the consistency of the intermediate feature maps. We extensively evaluated our method on pelvis, hand, and chest landmark detection tasks. Our experimental results demonstrate consistent and significant improvements over previous methods.
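The abstract does not spell out the exact form of the consistency term; as a rough illustration only, a JS-divergence loss between teacher and student intermediate feature maps could be sketched as below (PyTorch-style; the function name, tensor shapes, and softmax normalization over spatial locations are assumptions, not details from the paper):

    import torch
    import torch.nn.functional as F

    def js_consistency_loss(student_feat: torch.Tensor,
                            teacher_feat: torch.Tensor,
                            eps: float = 1e-8) -> torch.Tensor:
        """Jensen-Shannon divergence between two (N, C, H, W) feature maps.

        Each channel's spatial map is turned into a distribution with a
        softmax over the flattened H*W locations, then JS(P || Q) is
        computed as 0.5*KL(P || M) + 0.5*KL(Q || M) with M = (P + Q) / 2.
        """
        n, c, h, w = student_feat.shape
        p = F.softmax(student_feat.view(n, c, h * w), dim=-1)
        q = F.softmax(teacher_feat.view(n, c, h * w), dim=-1)
        m = 0.5 * (p + q)
        kl_pm = (p * (p.add(eps).log() - m.add(eps).log())).sum(dim=-1)
        kl_qm = (q * (q.add(eps).log() - m.add(eps).log())).sum(dim=-1)
        return (0.5 * (kl_pm + kl_qm)).mean()

In a teacher-student setup, such a term would typically be applied to unlabeled images, with the teacher's features detached from the gradient, e.g. js_consistency_loss(student_feat, teacher_feat.detach()).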
Cross-View Person Identification by Matching Human Poses Estimated With Confidence on Each Body Joint
Liang, Guoqiang (Xi'an Jiaotong University) | Lan, Xuguang (Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics) | Zheng, Kang (University of South Carolina) | Wang, Song (University of South Carolina) | Zheng, Nanning (Xi'an Jiaotong University)
Cross-view person identification (CVPI) from multiple temporally synchronized videos, taken by multiple wearable cameras from different and varying views, is a very challenging but important problem that has attracted increasing interest recently. The current state-of-the-art CVPI performance is achieved by matching appearance and motion features across videos, while matching pose features does not work effectively given the high inaccuracy of 3D human pose estimation on videos/images collected in the wild. In this paper, we introduce a new confidence metric for 3D human pose estimation and show that combining the inaccurately estimated human pose with the inferred confidence metric can boost CVPI performance: the estimated pose information can be integrated with the appearance and motion features to achieve new state-of-the-art CVPI performance. More specifically, the confidence metric is estimated at each human-body joint, and joints with higher confidence are weighted more in the pose matching for CVPI. In the experiments, we validate the proposed method on three wearable-camera video datasets and compare its performance against several other existing CVPI methods.
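As a hedged sketch of the joint-wise weighting idea only (the paper's exact confidence estimation and matching scheme are not reproduced here; the function name and the multiplicative combination of the two confidence vectors are illustrative assumptions):

    import numpy as np

    def weighted_pose_distance(pose_a: np.ndarray,
                               pose_b: np.ndarray,
                               conf_a: np.ndarray,
                               conf_b: np.ndarray) -> float:
        """Confidence-weighted distance between two estimated 3D poses.

        pose_a, pose_b: (J, 3) arrays of estimated joint coordinates.
        conf_a, conf_b: (J,) per-joint confidence scores in [0, 1].
        Joints that both estimators trust contribute more to the match.
        """
        w = conf_a * conf_b                 # joint-wise combined confidence
        w = w / (w.sum() + 1e-8)            # normalize weights to sum to 1
        per_joint = np.linalg.norm(pose_a - pose_b, axis=1)
        return float((w * per_joint).sum())

A small weighted distance would then indicate that the two views likely show the same person, and this pose score could be fused with appearance and motion similarities.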
Co-Saliency Detection Within a Single Image
Yu, Hongkai (University of South Carolina) | Zheng, Kang (University of South Carolina) | Fang, Jianwu (Chang'an University) | Guo, Hao (University of South Carolina) | Feng, Wei (Tianjin University) | Wang, Song (University of South Carolina)
Recently, saliency detection in a single image and co-saliency detection across multiple images have drawn extensive research interest in the vision community. In this paper, we investigate a new problem of co-saliency detection within a single image, i.e., detecting within-image co-saliency. By identifying common saliency within an image, e.g., highlighting multiple occurrences of an object class with similar appearance, this work can benefit many important applications, such as the detection of objects of interest, more robust object recognition, reduction of information redundancy, and animation synthesis. We propose a new bottom-up method to address this problem. Specifically, a large number of object proposals are first detected from the image. We then develop an optimization algorithm to derive a set of proposal groups, each of which contains multiple proposals showing good common saliency in the original image. For each proposal group, we calculate a co-saliency map, and we then use a low-rank-based algorithm to fuse the maps calculated from all the proposal groups into the final co-saliency map for the image. For the experiments, we collect a new dataset of 364 color images with within-image co-saliency. Experimental results show that the proposed method detects within-image co-saliency better than existing algorithms.
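The low-rank fusion step could be illustrated, very loosely, with a truncated-SVD sketch (the paper's actual low-rank algorithm is not reproduced here; the shapes, rank choice, and final averaging are assumptions made for illustration):

    import numpy as np

    def fuse_cosaliency_maps(maps: np.ndarray, rank: int = 1) -> np.ndarray:
        """Fuse per-group co-saliency maps with a low-rank (truncated-SVD) prior.

        maps: (G, H, W) stack of co-saliency maps, one per proposal group.
        The intuition: maps that agree across groups span a low-rank
        subspace, so a rank-k approximation suppresses group-specific noise.
        Returns a single (H, W) fused map normalized to [0, 1].
        """
        g, h, w = maps.shape
        m = maps.reshape(g, h * w)          # each row is one flattened map
        u, s, vt = np.linalg.svd(m, full_matrices=False)
        low_rank = u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank]  # rank-k approx.
        fused = low_rank.mean(axis=0).reshape(h, w)  # average the agreeing part
        fused -= fused.min()
        return fused / (fused.max() + 1e-8)

This keeps only the component shared by the per-group maps before averaging, which is one simple way to realize "low-rank based" fusion under the stated assumptions.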