Word2Pix: Word to Pixel Cross Attention Transformer in Visual Grounding
