AITopics | Geophysical Analysis & Survey

Collaborating Authors

Geophysical Analysis & Survey

Robust Change Captioning in Remote Sensing: SECOND-CC Dataset and MModalCC Framework

Karaca, Ali Can, Ozelbas, M. Enes, Berber, Saadettin, Karimli, Orkhan, Yildirim, Turabi, Amasyali, M. Fatih

arXiv.org Artificial IntelligenceJan-17-2025

Remote sensing change captioning (RSICC) aims to describe changes between bitemporal images in natural language. Existing methods often fail under challenges like illumination differences, viewpoint changes, blur effects, leading to inaccuracies, especially in no-change regions. Moreover, the images acquired at different spatial resolutions and have registration errors tend to affect the captions. To address these issues, we introduce SECOND-CC, a novel RSICC dataset featuring high-resolution RGB image pairs, semantic segmentation maps, and diverse real-world scenarios. SECOND-CC which contains 6,041 pairs of bitemporal RS images and 30,205 sentences describing the differences between images. Additionally, we propose MModalCC, a multimodal framework that integrates semantic and visual data using advanced attention mechanisms, including Cross-Modal Cross Attention (CMCA) and Multimodal Gated Cross Attention (MGCA). Detailed ablation studies and attention visualizations further demonstrate its effectiveness and ability to address RSICC challenges. Comprehensive experiments show that MModalCC outperforms state-of-the-art RSICC methods, including RSICCformer, Chg2Cap, and PSNet with +4.6% improvement on BLEU4 score and +9.6% improvement on CIDEr score. We will make our dataset and codebase publicly available to facilitate future research at https://github.com/ChangeCapsInRS/SecondCC

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.10075

Country:

Asia > Middle East (0.28)
Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
(2 more...)

Add feedback

ForestProtector: An IoT Architecture Integrating Machine Vision and Deep Reinforcement Learning for Efficient Wildfire Monitoring

Bonilla-Ormachea, Kenneth, Cuizaga, Horacio, Salcedo, Edwin, Castro, Sebastian, Fernandez-Testa, Sergio, Mamani, Misael

arXiv.org Artificial IntelligenceJan-16-2025

Early detection of forest fires is crucial to minimizing the environmental and socioeconomic damage they cause. Indeed, a fire's duration directly correlates with the difficulty and cost of extinguishing it. For instance, a fire burning for 1 minute might require 1 liter of water to extinguish, while a 2-minute fire could demand 100 liters, and a 10-minute fire might necessitate 1,000 liters. On the other hand, existing fire detection systems based on novel technologies (e.g., remote sensing, PTZ cameras, UAVs) are often expensive and require human intervention, making continuous monitoring of large areas impractical. To address this challenge, this work proposes a low-cost forest fire detection system that utilizes a central gateway device with computer vision capabilities to monitor a 360{\deg} field of view for smoke at long distances. A deep reinforcement learning agent enhances surveillance by dynamically controlling the camera's orientation, leveraging real-time sensor data (smoke levels, ambient temperature, and humidity) from distributed IoT devices. This approach enables automated wildfire monitoring across expansive areas while reducing false positives.

detection, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2501.09926

Country: North America > United States (0.29)

Genre:

Research Report (1.00)
Overview > Innovation (0.48)

Industry:

Energy > Oil & Gas (0.68)
Materials > Chemicals (0.46)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

Deep Self-Supervised Disturbance Mapping with the OPERA Sentinel-1 Radiometric Terrain Corrected SAR Backscatter Product

Hardiman-Mostow, Harris, Marshak, Charles, Handwerger, Alexander L.

arXiv.org Artificial IntelligenceJan-15-2025

Mapping land surface disturbances supports disaster response, resource and ecosystem management, and climate adaptation efforts. Synthetic aperture radar (SAR) is an invaluable tool for disturbance mapping, providing consistent time-series images of the ground regardless of weather or illumination conditions. Despite SAR's potential for disturbance mapping, processing SAR data to an analysis-ready format requires expertise and significant compute resources, particularly for large-scale global analysis. In October 2023, NASA's Observational Products for End-Users from Remote Sensing Analysis (OPERA) project released the near-global Radiometric Terrain Corrected SAR backscatter from Sentinel-1 (RTC-S1) dataset, providing publicly available, analysis-ready SAR imagery. In this work, we utilize this new dataset to systematically analyze land surface disturbances. As labeling SAR data is often prohibitively time-consuming, we train a self-supervised vision transformer - which requires no labels to train - on OPERA RTC-S1 data to estimate a per-pixel distribution from the set of baseline imagery and assess disturbances when there is significant deviation from the modeled distribution. To test our model's capability and generality, we evaluate three different natural disasters - which represent high-intensity, abrupt disturbances - from three different regions of the world. Across events, our approach yields high quality delineations: F1 scores exceeding 0.6 and Areas Under the Precision-Recall Curve exceeding 0.65, consistently outperforming existing SAR disturbance methods. Our findings suggest that a self-supervised vision transformer is well-suited for global disturbance mapping and can be a valuable tool for operational, near-global disturbance monitoring, particularly when labeled data does not exist.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.09129

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Government > Space Agency (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.38)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

FLAVARS: A Multimodal Foundational Language and Vision Alignment Model for Remote Sensing

Corley, Isaac, Nsutezo, Simone Fobi, Ortiz, Anthony, Robinson, Caleb, Dodhia, Rahul, Ferres, Juan M. Lavista, Najafirad, Peyman

arXiv.org Artificial IntelligenceJan-14-2025

Remote sensing imagery is dense with objects and contextual visual information. There is a recent trend to combine paired satellite images and text captions for pretraining performant encoders for downstream tasks. However, while contrastive image-text methods like CLIP enable vision-language alignment and zero-shot classification ability, vision-only downstream performance tends to degrade compared to image-only pretraining, such as MAE. In this paper, we propose FLAVARS, a pretraining method that combines the best of both contrastive learning and masked modeling, along with geospatial alignment via contrastive location encoding. We find that FLAVARS significantly outperforms a baseline of SkyCLIP for vision-only tasks such as KNN classification and semantic segmentation, +6\% mIOU on SpaceNet1, while retaining the ability to perform zero-shot classification, unlike MAE pretrained methods.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.0849

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.79)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

Add feedback

deepTerra -- AI Land Classification Made Easy

Wilkinson, Andrew Keith

arXiv.org Artificial IntelligenceJan-14-2025

These comprise the following modules: using machine learning and satellite imagery. The platform includes modules for data collection, Data Collection: This module provides tools to image augmentation, training, testing, and prediction, extract suitable image patches from pre-existing images streamlining the entire workflow for image or download satellite imagery from sources like classification tasks. This paper presents a detailed Google Earth. It also includes features for labeling overview of the capabilities of deepTerra, shows and organizing datasets efficiently. When geographic how it has been applied to various research areas, coordinates are available, they are automatically and discusses the future directions it might take.

artificial intelligence, dataset, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.07859

Genre: Research Report (0.40)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.55)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Add feedback

Multi-Label Scene Classification in Remote Sensing Benefits from Image Super-Resolution

Mudraje, Ashitha, Moser, Brian B., Frolov, Stanislav, Dengel, Andreas

arXiv.org Artificial IntelligenceJan-12-2025

Satellite imagery is a cornerstone for numerous Remote Sensing (RS) applications; however, limited spatial resolution frequently hinders the precision of such systems, especially in multi-label scene classification tasks as it requires a higher level of detail and feature differentiation. In this study, we explore the efficacy of image Super-Resolution (SR) as a pre-processing step to enhance the quality of satellite images and thus improve downstream classification performance. We investigate four SR models - SRResNet, HAT, SeeSR, and RealESRGAN - and evaluate their impact on multi-label scene classification across various CNN architectures, including ResNet-50, ResNet-101, ResNet-152, and Inception-v4. Our results show that applying SR significantly improves downstream classification performance across various metrics, demonstrating its ability to preserve spatial details critical for multi-label tasks. Overall, this work offers valuable insights into the selection of SR techniques for multi-label prediction in remote sensing and presents an easy-to-integrate framework to improve existing RS systems.

artificial intelligence, classification, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.0672

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Advancing ALS Applications with Large-Scale Pre-training: Dataset Development and Downstream Assessment

Xiu, Haoyi, Liu, Xin, Kim, Taehoon, Kim, Kyoung-Sook

arXiv.org Artificial IntelligenceJan-9-2025

The pre-training and fine-tuning paradigm has revolutionized satellite remote sensing applications. However, this approach remains largely underexplored for airborne laser scanning (ALS), an important technology for applications such as forest management and urban planning. In this study, we address this gap by constructing a large-scale ALS point cloud dataset and evaluating its impact on downstream applications. Our dataset comprises ALS point clouds collected across the contiguous United States, provided by the United States Geological Survey's 3D Elevation Program. To ensure efficient data collection while capturing diverse land cover and terrain types, we introduce a geospatial sampling method that selects point cloud tiles based on land cover maps and digital elevation models. As a baseline self-supervised learning model, we adopt BEV-MAE, a state-of-the-art masked autoencoder for 3D outdoor point clouds, and pre-train it on the constructed dataset. The pre-trained models are subsequently fine-tuned for downstream tasks, including tree species classification, terrain scene recognition, and point cloud semantic segmentation. Our results show that the pre-trained models significantly outperform their scratch counterparts across all downstream tasks, demonstrating the transferability of the representations learned from the proposed dataset. Furthermore, we observe that scaling the dataset using our geospatial sampling method consistently enhances performance, whereas pre-training on datasets constructed with random sampling fails to achieve similar improvements. These findings highlight the utility of the constructed dataset and the effectiveness of our sampling strategy in the pre-training and fine-tuning paradigm. The source code and pre-trained models will be made publicly available at \url{https://github.com/martianxiu/ALS_pretraining}.

dataset, point cloud, proceedings, (12 more...)

arXiv.org Artificial Intelligence

2501.05095

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Alaska (0.04)
North America > United States > Indiana (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.88)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images

Wang, Yuze, Xiao, Rong, Li, Haifeng, Belgiu, Mariana, Tao, Chao

arXiv.org Artificial IntelligenceJan-8-2025

In remote sensing scene classification, leveraging the transfer methods with well-trained optical models is an efficient way to overcome label scarcity. However, cloud contamination leads to optical information loss and significant impacts on feature distribution, challenging the reliability and stability of transferred target models. Common solutions include cloud removal for optical data or directly using Synthetic aperture radar (SAR) data in the target domain. However, cloud removal requires substantial auxiliary data for support and pre-training, while directly using SAR disregards the unobstructed portions of optical data. This study presents a scene classification transfer method that synergistically combines multi-modality data, which aims to transfer the source domain model trained on cloudfree optical data to the target domain that includes both cloudy optical and SAR data at low cost. Specifically, the framework incorporates two parts: (1) the collaborative transfer strategy, based on knowledge distillation, enables the efficient prior knowledge transfer across heterogeneous data; (2) the information regulation mechanism (IRM) is proposed to address the modality imbalance issue during transfer. It employs auxiliary models to measure the contribution discrepancy of each modality, and automatically balances the information utilization of modalities during the target model learning process at the sample-level. The transfer experiments were conducted on simulated and real cloud datasets, demonstrating the superior performance of the proposed method compared to other solutions in cloud-covered scenarios. We also verified the importance and limitations of IRM, and further discussed and visualized the modality imbalance problem during the model transfer. Codes are available at https://github.com/wangyuze-csu/ESCCS

dataset, information, modality, (13 more...)

arXiv.org Artificial Intelligence

2501.04283

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hunan Province > Changsha (0.04)
Asia > China > Hong Kong (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry:

Education (0.88)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Integrating remote sensing data assimilation, deep learning and large language model for interactive wheat breeding yield prediction

Yang, Guofeng, Jin, Nanfei, Ai, Wenjie, Zheng, Zhonghua, He, Yuhong, He, Yong

arXiv.org Artificial IntelligenceJan-8-2025

Yield is one of the core goals of crop breeding. By predicting the potential yield of different breeding materials, breeders can screen these materials at various growth stages to select the best performing. Based on unmanned aerial vehicle remote sensing technology, high-throughput crop phenotyping data in breeding areas is collected to provide data support for the breeding decisions of breeders. However, the accuracy of current yield predictions still requires improvement, and the usability and user-friendliness of yield forecasting tools remain suboptimal. To address these challenges, this study introduces a hybrid method and tool for crop yield prediction, designed to allow breeders to interactively and accurately predict wheat yield by chatting with a large language model (LLM). First, the newly designed data assimilation algorithm is used to assimilate the leaf area index into the WOFOST model. Then, selected outputs from the assimilation process, along with remote sensing inversion results, are used to drive the time-series temporal fusion transformer model for wheat yield prediction. Finally, based on this hybrid method and leveraging an LLM with retrieval augmented generation technology, we developed an interactive yield prediction Web tool that is user-friendly and supports sustainable data updates. This tool integrates multi-source data to assist breeding decision-making. This study aims to accelerate the identification of high-yield materials in the breeding process, enhance breeding efficiency, and enable more scientific and smart breeding decisions.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.04487

Country:

North America > United States (1.00)
Europe (1.00)
Asia > China > Zhejiang Province (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine (1.00)
Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GDSR: Global-Detail Integration through Dual-Branch Network with Wavelet Losses for Remote Sensing Image Super-Resolution

Zhu, Qiwei, Li, Kai, Zhang, Guojing, Wang, Xiaoying, Huang, Jianqiang, Li, Xilai

arXiv.org Artificial IntelligenceJan-7-2025

In recent years, deep neural networks, including Convolutional Neural Networks, Transformers, and State Space Models, have achieved significant progress in Remote Sensing Image (RSI) Super-Resolution (SR). However, existing SR methods typically overlook the complementary relationship between global and local dependencies. These methods either focus on capturing local information or prioritize global information, which results in models that are unable to effectively capture both global and local features simultaneously. Moreover, their computational cost becomes prohibitive when applied to large-scale RSIs. To address these challenges, we introduce the novel application of Receptance Weighted Key Value (RWKV) to RSI-SR, which captures long-range dependencies with linear complexity. To simultaneously model global and local features, we propose the Global-Detail dual-branch structure, GDSR, which performs SR reconstruction by paralleling RWKV and convolutional operations to handle large-scale RSIs. Furthermore, we introduce the Global-Detail Reconstruction Module (GDRM) as an intermediary between the two branches to bridge their complementary roles. In addition, we propose Wavelet Loss, a loss function that effectively captures high-frequency detail information in images, thereby enhancing the visual quality of SR, particularly in terms of detail reconstruction. Extensive experiments on several benchmarks, including AID, AID_CDM, RSSRD-QH, and RSSRD-QH_CDM, demonstrate that GSDR outperforms the state-of-the-art Transformer-based method HAT by an average of 0.05 dB in PSNR, while using only 63% of its parameters and 51% of its FLOPs, achieving an inference speed 2.9 times faster. Furthermore, the Wavelet Loss shows excellent generalization across various architectures, providing a novel perspective for RSI-SR enhancement.

artificial intelligence, machine learning, remote sensing image super-resolution, (5 more...)

arXiv.org Artificial Intelligence

2501.0146

Genre: Research Report (0.66)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.60)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback