Hierarchical Identity Learning for Unsupervised Visible-Infrared Person Re-Identification
Shi, Haonan, Wang, Yubin, Cheng, De, He, Lingfeng, Wang, Nannan, Gao, Xinbo
arXiv.org Artificial Intelligence
Abstract: Unsupervised visible-infrared person re-identification (USVI-ReID) aims to learn modality-invariant image features from unlabeled cross-modal person datasets by reducing the modality gap while minimizing reliance on costly manual annotations. Existing methods typically address USVI-ReID using cluster-based contrastive learning, which represents a person by a single cluster center. However, they primarily focus on the commonality of images within each cluster while neglecting the finer-grained differences among them. To address this limitation, we propose a Hierarchical Identity Learning (HIL) framework. Since each cluster may contain several smaller sub-clusters that reflect fine-grained variations among images, we generate multiple memories for each existing coarse-grained cluster via a secondary clustering. Additionally, we propose Multi-Center Contrastive Learning (MCCL) to refine representations, enhancing intra-modal clustering and minimizing cross-modal discrepancies. To further improve cross-modal matching quality, we design a Bidirectional Reverse Selection Transmission (BRST) mechanism, which establishes reliable cross-modal correspondences by performing bidirectional matching of pseudo-labels. Extensive experiments conducted on the SYSU-MM01 and RegDB datasets demonstrate that the proposed method outperforms existing approaches.

Visible-infrared person re-identification (VI-ReID) [1], [2], [3], [4], [5], [6] is an important research direction in the field of computer vision, aiming to match the images of the same person between the visible and infrared modalities.
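The abstract mentions bidirectional matching of pseudo-labels across modalities but does not spell out the procedure. As a rough illustration only, the sketch below shows a generic mutual-nearest-neighbor check between visible and infrared cluster centers: a pair is kept only when each center is the other's best match. This is not the paper's BRST mechanism; the function names (`cosine`, `nearest`, `mutual_matches`), the use of cosine similarity, and the toy 2-D centers are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest(queries, gallery):
    """For each query vector, return the index of the most similar gallery vector."""
    return [
        max(range(len(gallery)), key=lambda j: cosine(q, gallery[j]))
        for q in queries
    ]

def mutual_matches(vis_centers, ir_centers):
    """Keep (visible, infrared) index pairs that select each other in both directions.

    Hypothetical helper: a generic mutual-nearest-neighbor filter,
    not the BRST procedure from the paper.
    """
    v2i = nearest(vis_centers, ir_centers)  # visible -> infrared
    i2v = nearest(ir_centers, vis_centers)  # infrared -> visible
    return [(v, i) for v, i in enumerate(v2i) if i2v[i] == v]

# Toy cluster centers (assumed values, for illustration only).
vis = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0]]
ir = [[0.9, 0.1], [0.1, 0.9]]
pairs = mutual_matches(vis, ir)
```

In this toy setup the third visible center also prefers the first infrared center, but that infrared center prefers the first visible one, so the one-sided pair is discarded. Filtering of this kind trades coverage for reliability: fewer cross-modal correspondences survive, but each is confirmed from both directions.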
Sep-16-2025