Enabling Near-realtime Remote Sensing via Satellite-Ground Collaboration of Large Vision-Language Models
Li, Zihan, Yang, Jiahao, Zhang, Yuxin, Chen, Zhe, Gao, Yue
arXiv.org Artificial Intelligence
Large vision-language models (LVLMs) have recently demonstrated great potential in remote sensing (RS) tasks (e.g., disaster monitoring) conducted by low Earth orbit (LEO) satellites. However, their deployment in real-world LEO satellite systems remains largely unexplored, hindered by limited onboard computing resources and brief satellite-ground contacts. We propose Grace, a satellite-ground collaborative system designed for near-realtime LVLM inference in RS tasks. Accordingly, we deploy a compact LVLM on satellites for realtime inference and larger ones on ground stations (GSs) to guarantee end-to-end performance. Grace comprises two main phases: asynchronous satellite-GS Retrieval-Augmented Generation (RAG) and a task dispatch algorithm. First, we distill the knowledge archive of the GS RAG into the satellite archive, using a tailored adaptive update algorithm during the limited satellite-ground data exchange periods. Second, we propose a confidence-based test algorithm that either processes the task onboard the satellite or offloads it to the GS. Extensive experiments based on real-world satellite orbital data show that Grace reduces the average latency by 76-95% compared to state-of-the-art methods, without compromising inference accuracy.
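The dispatch decision described in the abstract can be illustrated with a minimal sketch. The abstract does not say how confidence is computed or which threshold is used, so the names `dispatch`, `onboard_infer`, `ground_infer`, and the default threshold of 0.8 are all hypothetical placeholders, not the paper's algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class DispatchResult:
    answer: str
    confidence: float  # confidence reported by the onboard model
    location: str      # "satellite" or "ground"


def dispatch(
    task: str,
    onboard_infer: Callable[[str], Tuple[str, float]],
    ground_infer: Callable[[str], str],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> DispatchResult:
    """Keep the onboard answer when it is confident enough, else offload."""
    answer, confidence = onboard_infer(task)
    if confidence >= threshold:
        return DispatchResult(answer, confidence, "satellite")
    # Low confidence: defer to the larger model at the ground station.
    return DispatchResult(ground_infer(task), confidence, "ground")
```

In this sketch the compact onboard model always runs first, and only low-confidence tasks pay the cost of a satellite-ground transfer.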
Oct-29-2025
- Country:
- Asia > China
- North America > United States
- New York > New York County > New York City (0.04)
- Genre:
- Research Report
- New Finding (0.46)
- Promising Solution (0.66)