Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Muchen Li, Leonid Sigal
–Neural Information Processing Systems
Previous approaches to referring expression comprehension (REC) or segmentation (RES) either suffer from limited performance due to a two-stage setup or require designing complex, task-specific one-stage architectures.