GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution
Neural Information Processing Systems
Ultra-high-resolution (UHR) remote sensing (RS) imagery offers valuable data for Earth observation but poses challenges for existing multimodal foundation models due to two key bottlenecks: (1) limited availability of UHR training data, and (2) token explosion caused by the large image size. To address data scarcity, we introduce **SuperRS-VQA** (avg.
Jun-14-2026, 06:42:21 GMT