CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation

Neural Information Processing Systems 

Given an image and a natural language sentence, a referring image segmentation (RIS) model must predict a mask for the object that the sentence describes.
