A Simple Aerial Detection Baseline of Multimodal Language Models
