FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
