From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects
