Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks
–Neural Information Processing Systems
Visual tasks vary a lot in their output formats and concerned contents, therefore it is hard to process them with an identical structure. One main obstacle lies in the high-dimensional outputs in object-level visual tasks. In this paper, we propose an object-centric vision framework, Obj2Seq. Obj2Seq takes objects as basic units, and regards most object-level visual tasks as sequence generation problems of objects. Therefore, these visual tasks can be decoupled into two steps.
Neural Information Processing Systems
Oct-9-2024, 18:30:56 GMT
- Technology: