UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

Neural Information Processing Systems 

Generalist models have achieved remarkable success in both language and vision-language tasks, showcasing the potential of unified modeling. However, effectively integrating fine-grained perception tasks like detection and segmentation into these models remains a significant challenge. This is primarily because these tasks often rely heavily on task-specific designs and architectures that can complicate the modeling process. To address this challenge, we present UFO, a framework that Unifies Fine-grained visual perception tasks through an Open-ended language interface.
