Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data
