CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation
Neural Information Processing Systems
To demonstrate the effectiveness of our approach, we replace the image backbone of CoupAlign with other networks, such as ResNet101 [3] and Darknet53 [9], and evaluate it on the RefCOCO validation set. In Tab. 1, we compare our results with methods that use ResNet101 as the image backbone. In Tab. 2, we compare with methods that use Darknet53. The results show that CoupAlign still surpasses previous methods when using the same image backbone, which indicates that CoupAlign is compatible with popular backbones. In our experiments, we use four WPA modules: two in the early encoding stage and two in the late encoding stage.
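The backbone-swap setup described above can be sketched as a small configuration object. This is only an illustrative sketch, not the authors' code: the names `BackboneSwapConfig`, `BACKBONES`, `wpa_layout`, and `backbone_swap_runs` are hypothetical, and the only facts taken from the text are the two backbones and the split of the four WPA modules into two early and two late.

```python
from dataclasses import dataclass

# All names below are hypothetical; none come from the CoupAlign code base.
BACKBONES = ("resnet101", "darknet53")  # the two backbones named in the text


@dataclass(frozen=True)
class BackboneSwapConfig:
    backbone: str
    early_wpa: int = 2  # WPA modules in the early encoding stage
    late_wpa: int = 2   # WPA modules in the late encoding stage

    def __post_init__(self):
        # Reject backbones that are not part of this comparison.
        if self.backbone not in BACKBONES:
            raise ValueError(f"unsupported backbone: {self.backbone}")

    def wpa_layout(self):
        """Return the encoding stage of each WPA module, in order."""
        return ["early"] * self.early_wpa + ["late"] * self.late_wpa


def backbone_swap_runs():
    """One configuration per backbone; only the backbone varies."""
    return [BackboneSwapConfig(backbone=name) for name in BACKBONES]


for cfg in backbone_swap_runs():
    print(cfg.backbone, cfg.wpa_layout())
```

Keeping the WPA counts fixed while varying only `backbone` mirrors the comparison in the text, where each method is compared against others that share the same image backbone.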
May-30-2025, 08:03:49 GMT