Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
