Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head
