AITopics | Geophysical Analysis & Survey

Geophysical Analysis & Survey

From underwater to aerial: a novel multi-scale knowledge distillation approach for coral reef monitoring

Contini, Matteo, Illien, Victor, Barde, Julien, Poulain, Sylvain, Bernard, Serge, Joly, Alexis, Bonhommeau, Sylvain

arXiv.org Artificial Intelligence, Feb-25-2025

Drone-based remote sensing combined with AI-driven methodologies has shown great potential for accurate mapping and monitoring of coral reef ecosystems. This study presents a novel multi-scale approach to coral reef monitoring, integrating fine-scale underwater imagery with medium-scale aerial imagery. Underwater images are captured using an Autonomous Surface Vehicle (ASV), while aerial images are acquired with an aerial drone. A transformer-based deep-learning model is trained on underwater images to detect the presence of 31 classes covering various coral morphotypes, associated fauna, and habitats. These predictions serve as annotations for training a second model applied to aerial images. The transfer of information across scales is achieved through a weighted footprint method that accounts for partial overlaps between underwater image footprints and aerial image tiles. The results show that the multi-scale methodology successfully extends fine-scale classification to larger reef areas, achieving a high degree of accuracy in predicting coral morphotypes and associated habitats. The method showed a strong alignment between underwater-derived annotations and ground truth data, reflected by an AUC (Area Under the Curve) score of 0.9251. This shows that the integration of underwater and aerial imagery, supported by deep-learning models, can facilitate scalable and accurate reef assessments. This study demonstrates the potential of combining multi-scale imaging and AI to facilitate the monitoring and conservation of coral reefs. Our approach leverages the strengths of underwater and aerial imagery, ensuring the precision of fine-scale analysis while extending it to cover a broader reef area.
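The weighted footprint idea in the abstract above can be illustrated with a minimal sketch. It assumes axis-aligned rectangular footprints in a shared coordinate system and weights each underwater prediction by its overlap area with the aerial tile; the names `Rect`, `overlap_area`, and `tile_label` are hypothetical, and the paper's actual weighting scheme may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in map coordinates (an assumption of this sketch)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles, or 0.0 if they are disjoint."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return width * height if width > 0 and height > 0 else 0.0


def tile_label(
    tile: Rect,
    footprints: Sequence[Tuple[Rect, Sequence[float]]],
) -> Optional[List[float]]:
    """Overlap-weighted average of per-class scores for one aerial tile.

    Each footprint carries the per-class scores predicted on one underwater
    image. Returns None when no footprint overlaps the tile.
    """
    total_weight = 0.0
    weighted_sum: Optional[List[float]] = None
    for rect, scores in footprints:
        weight = overlap_area(tile, rect)
        if weight == 0.0:
            continue  # this underwater image does not touch the tile
        if weighted_sum is None:
            weighted_sum = [0.0] * len(scores)
        for i, score in enumerate(scores):
            weighted_sum[i] += weight * score
        total_weight += weight
    if weighted_sum is None:
        return None
    return [value / total_weight for value in weighted_sum]
```

For example, a tile covered two-thirds by a footprint predicting class 0 and one-third by a footprint predicting class 1 receives a soft label that reflects those proportions rather than a hard vote.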

annotation, orthophoto, underwater image, (16 more...)

arXiv.org Artificial Intelligence

2502.17883

Country:

Africa > La Réunion (0.15)
Europe > France > Occitanie > Hérault > Montpellier (0.05)
North America > Canada > Quebec (0.04)
(7 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Food & Agriculture (0.46)
Information Technology (0.46)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Convolutional neural networks for mineral prospecting through alteration mapping with remote sensing data

Farahbakhsh, Ehsan, Goel, Dakshi, Pimparkar, Dhiraj, Muller, R. Dietmar, Chandra, Rohitash

arXiv.org Artificial Intelligence, Feb-24-2025

Traditional geological mapping methods, which rely on field observations and rock sample analysis, are inefficient for continuous spatial mapping of geological features such as alteration zones. Deep learning models such as convolutional neural networks (CNNs) have ushered in a transformative era in remote sensing data analysis. CNNs excel in automatically extracting features from image data for classification and regression problems. CNNs have the ability to pinpoint specific mineralogical changes attributed to mineralisation processes by discerning subtle features within remote sensing data. Our methodology involves model training using two distinct sets of training samples generated through ground truth data and a fully automated approach through selective principal component analysis (PCA). We also compare CNNs with conventional machine learning models, including k-nearest neighbours, support vector machines, and multilayer perceptron. Our findings indicate that training with a ground truth-based dataset produces more reliable alteration maps. Additionally, we find that CNNs perform slightly better when compared to conventional machine learning models, which further demonstrates the ability of CNNs to capture spatial patterns in remote sensing data effectively. We find that Landsat 9 surpasses Landsat 8 in mapping iron oxide areas when employing the CNNs model trained with ground truth data obtained by field surveys. We also observe that using ASTER data with the CNNs model trained on the ground truth-based dataset produces the most accurate maps for two other important types of alteration zones, argillic and propylitic. This underscores the utility of CNNs in enhancing the efficiency and precision of geological mapping, particularly in discerning subtle alterations indicative of mineralisation processes, especially those associated with critical metal resources.
Introduction: Geological maps are traditionally crafted through ground surveys and founded on field observations. They frequently incur inevitable errors due to the lack of spatial continuity of the field observations, thus yielding inaccurate representations (Campbell et al., 2005). Recognising these limitations, geologists have been prompted to seek innovative approaches and efficient methodologies to accurately map geological features, particularly alteration zones (Kesler, 2007; McCuaig et al., 2010). The utilisation of remote sensing data for alteration mapping emerges as a pivotal technique in regional mineral exploration, enabling the precise spatial identification of alteration zones associated with mineralisation processes (Mohamed et al., 2021).

artificial intelligence, machine learning, survey article, (18 more...)

arXiv.org Artificial Intelligence

2502.18533

Country:

North America > United States (0.68)
Oceania > Australia > New South Wales (0.14)
Europe (0.14)
Asia > India (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Materials > Metals & Mining (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Good Representation, Better Explanation: Role of Convolutional Neural Networks in Transformer-Based Remote Sensing Image Captioning

Das, Swadhin, Gupta, Saarthak, Kumar, Kamal, Sharma, Raksha

arXiv.org Artificial Intelligence, Feb-22-2025

Remote Sensing Image Captioning (RSIC) is the process of generating meaningful descriptions from remote sensing images. Recently, it has gained significant attention, with encoder-decoder models serving as the backbone for caption generation. The encoder extracts essential visual features from the input image, transforming them into a compact representation, while the decoder uses this representation to generate coherent textual descriptions. Transformer-based models have gained significant popularity due to their ability to capture long-range dependencies and contextual information. The decoder has been well explored for text generation, whereas the encoder remains relatively unexplored. However, optimizing the encoder is crucial, as it directly influences the richness of the extracted features, which in turn affects the quality of the generated captions. To address this gap, we systematically evaluate twelve different convolutional neural network (CNN) architectures within a transformer-based encoder framework to assess their effectiveness in RSIC. The evaluation consists of two stages: first, a numerical analysis groups the CNNs into clusters based on their performance; the best-performing CNNs are then assessed by a human annotator. Additionally, we analyze the impact of different search strategies, namely greedy search and beam search, to obtain the best caption. The results highlight the critical role of encoder selection in improving captioning performance, demonstrating that specific CNN architectures significantly enhance the quality of generated descriptions for remote sensing images.
Introduction: With the advancement of remote sensing technologies and machine learning-based methods, the demand for Remote Sensing Image Captioning (RSIC) [1, 2] is growing rapidly. It plays a crucial role in various fields, including environmental monitoring, urban planning, and disaster management, by providing automated textual descriptions of satellite images.
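The difference between the two search strategies named in the abstract, greedy search and beam search, can be shown with a short sketch. It uses a hypothetical toy next-token table (`toy_step`) rather than a real captioning decoder and omits refinements such as length normalization, so it only illustrates the general technique and not the authors' configuration.

```python
import math
from typing import Callable, Dict, List, Tuple

StepFn = Callable[[Tuple[str, ...]], Dict[str, float]]


def greedy_search(step_fn: StepFn, max_len: int, eos: str) -> Tuple[List[str], float]:
    """Pick the single most probable token at every step."""
    seq: List[str] = []
    score = 0.0
    for _ in range(max_len):
        probs = step_fn(tuple(seq))
        token = max(probs, key=probs.get)
        seq.append(token)
        score += math.log(probs[token])
        if token == eos:
            break
    return seq, score


def beam_search(
    step_fn: StepFn, max_len: int, eos: str, beam_width: int
) -> Tuple[List[str], float]:
    """Keep the beam_width best partial sequences by total log-probability."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    finished: List[Tuple[List[str], float]] = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, prob in step_fn(tuple(seq)).items():
                if prob > 0.0:
                    candidates.append((seq + [token], score + math.log(prob)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            # Sequences that emitted the end token are set aside as complete.
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # include unfinished beams if max_len was reached
    return max(finished, key=lambda c: c[1])


def toy_step(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical next-token distribution, used for illustration only."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    return table.get(prefix, {"<eos>": 1.0})
```

With this table, greedy search commits to the locally best first token `a` and ends with a sequence of probability 0.3, while beam search with a width of 2 keeps `b` alive and finds `b z <eos>` with probability 0.36.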

caption, dataset, encoder, (13 more...)

arXiv.org Artificial Intelligence

2502.16095

Country:

Asia > India > Uttarakhand > Roorkee (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CARE: Confidence-Aware Regression Estimation of building density fine-tuning EO Foundation Models

Dionelis, Nikolaos, Bosmans, Jente, Longépé, Nicolas

arXiv.org Artificial Intelligence, Feb-19-2025

Performing accurate confidence quantification and assessment in pixel-wise regression tasks, which are downstream applications of AI Foundation Models for Earth Observation (EO), is important for deep neural networks to predict their failures, improve their performance, and enhance their capabilities in real-world applications, for their practical deployment. For pixel-wise regression tasks, specifically utilizing remote sensing data from satellite imagery in EO Foundation Models, confidence quantification is a critical challenge. The focus of this research is on developing a Foundation Model using EO satellite data that computes and assigns a confidence metric alongside regression outputs to improve the reliability and interpretability of predictions generated by deep neural networks. To this end, we develop, train and evaluate the proposed Confidence-Aware Regression Estimation (CARE) Foundation Model. Our model CARE computes and assigns confidence to regression results as downstream tasks of a Foundation Model for EO data, and performs a confidence-aware self-corrective learning method for the low-confidence regions. We evaluate the model CARE on multi-spectral data from the Copernicus Sentinel-2 constellation to estimate building density. We also show that our model CARE outperforms other methods. The significance of confidence quantification and assessment in deep learning, specifically in AI Foundation Models in Earth Observation (EO) that use satellite data, for regression applications is critical. The utility of satellite data seems inexhaustible, and thanks to developments in AI, applications emerge at an accelerated pace in EO Foundation Models using remote sensing data.

confidence metric, foundation model, neural network, (13 more...)

arXiv.org Artificial Intelligence

2502.13734

Country:

North America (0.14)
South America (0.04)
Europe > Switzerland (0.04)
(10 more...)

Genre: Research Report (0.64)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.75)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Remote Sensing Semantic Segmentation Quality Assessment based on Vision Language Model

Shi, Huiying, Tan, Zhihong, Zhang, Zhihan, Wei, Hongchen, Hu, Yaosi, Zhang, Yingxue, Chen, Zhenzhong

arXiv.org Artificial Intelligence, Feb-18-2025

The complexity of scenes and variations in image quality result in significant variability in the performance of semantic segmentation methods of remote sensing imagery (RSI) in supervised real-world scenarios. This makes the evaluation of semantic segmentation quality in such scenarios an issue to be resolved. However, most of the existing evaluation metrics are developed based on expert-labeled object-level annotations, which are not applicable in such scenarios. To address this issue, we propose RS-SQA, an unsupervised quality assessment model for RSI semantic segmentation based on vision language model (VLM). This framework leverages a pre-trained RS VLM for semantic understanding and utilizes intermediate features from segmentation methods to extract implicit information about segmentation quality. Specifically, we introduce CLIP-RS, a large-scale pre-trained VLM trained with purified text to reduce textual noise and capture robust semantic information in the RS domain. Feature visualizations confirm that CLIP-RS can effectively differentiate between various levels of segmentation quality. Semantic features and low-level segmentation features are effectively integrated through a semantic-guided approach to enhance evaluation accuracy. To further support the development of RS semantic segmentation quality assessment, we present RS-SQED, a dedicated dataset sampled from four major RS semantic segmentation datasets and annotated with segmentation accuracy derived from the inference results of 8 representative segmentation methods. Experimental results on the established dataset demonstrate that RS-SQA significantly outperforms state-of-the-art quality assessment models. This provides essential support for predicting segmentation accuracy and high-quality semantic segmentation interpretation, offering substantial practical value.

dataset, remote sensing, segmentation, (13 more...)

arXiv.org Artificial Intelligence

2502.1399

Country:

Europe > Germany > Brandenburg > Potsdam (0.05)
Europe > Switzerland (0.04)
Europe > France (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)

JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework

Liu, Ziyuan, Zhu, Ruifei, Gao, Long, Zhou, Yuanxiu, Ma, Jingyu, Gu, Yuantao

arXiv.org Artificial Intelligence, Feb-18-2025

Xu et al. [24] introduce a semi-supervised label and embedding consistency network (SS-LEC) for ORSI scene classification, which strategically enforces consistency across augmentations and stages of training. Li et al. [25] propose SemiCD-VL, a VLM-guided semi-supervised change detection method that synthesizes pseudo labels via a mixed change event generation strategy, achieving significant performance gains over FixMatch and SOTA unsupervised methods. However, DL-based CD methods generally face two major challenges: the scarcity of high-quality, high-resolution, all-inclusive CD datasets and limitations in handling highly dynamic change areas. Although numerous CD datasets have been constructed and proposed, they are often tailored to specific scenarios, which restricts the generalization capabilities of the algorithms. For instance, models trained on datasets focused on human-induced changes often fail to perform effectively when confronted with natural change scenarios.

change detection, dataset, remote sensing, (13 more...)

arXiv.org Artificial Intelligence

2502.13407

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.58)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation

Ma, Zhiming, Xiao, Xiayang, Dong, Sihao, Wang, Peidong, Wang, HaiPeng, Pan, Qingyun

arXiv.org Artificial Intelligence, Feb-17-2025

As a powerful all-weather Earth observation tool, synthetic aperture radar (SAR) remote sensing enables critical military reconnaissance, maritime surveillance, and infrastructure monitoring. Although vision language models (VLMs) have made remarkable progress in natural language processing and image understanding, their applications remain limited in professional domains due to insufficient domain expertise. This paper proposes the first large-scale multimodal dialogue dataset for SAR images, named SARChat-2M, which contains approximately 2 million high-quality image-text pairs and encompasses diverse scenarios with detailed target annotations. This dataset not only supports several key tasks, such as visual understanding and object detection, but also has unique innovative aspects: this study develops a visual-language dataset and benchmark for the SAR domain, enabling and evaluating VLMs' capabilities in SAR image interpretation, which provides a paradigmatic framework for constructing multimodal datasets across various remote sensing vertical domains. Experiments on 16 mainstream VLMs verify the effectiveness of the dataset. The project will be released at https://github.com/JimmyMa99/SARChat.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.08168

Country: Asia > China (0.68)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.57)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(2 more...)

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Mall, Utkarsh, Phoo, Cheng Perng, Chiquier, Mia, Hariharan, Bharath, Bala, Kavita, Vondrick, Carl

arXiv.org Artificial Intelligence, Feb-14-2025

Visual data is used in numerous scientific workflows ranging from remote sensing to ecology. As the amount of observation data increases, the challenge is not just to make accurate predictions but also to understand the underlying mechanisms for those predictions. Good interpretation is important in scientific workflows, as it allows for better decision-making by providing insights into the data. This paper introduces an automatic way of obtaining such interpretable-by-design models, by learning programs that interleave neural networks. We propose DiSciPLE (Discovering Scientific Programs using LLMs and Evolution), an evolutionary algorithm that leverages common sense and prior knowledge of large language models (LLMs) to create Python programs explaining visual data. Additionally, we propose two improvements, a program critic and a program simplifier, that further help our method synthesize good programs. On three different real-world problems, DiSciPLE learns state-of-the-art programs on novel tasks with no prior literature. For example, we can learn programs with 35% lower error than the closest non-interpretable baseline for population density estimation.

evolutionary algorithm, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.1006

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Education (0.66)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.89)

FE-LWS: Refined Image-Text Representations via Decoder Stacking and Fused Encodings for Remote Sensing Image Captioning

Das, Swadhin, Sharma, Raksha

arXiv.org Artificial Intelligence, Feb-13-2025

Remote sensing image captioning aims to generate descriptive text from remote sensing images, typically employing an encoder-decoder framework. In this setup, a convolutional neural network (CNN) extracts feature representations from the input image, which then guide the decoder in a sequence-to-sequence caption generation process. Although much research has focused on refining the decoder, the quality of image representations from the encoder remains crucial for accurate captioning. This paper introduces a novel approach that integrates features from two distinct CNN-based encoders, capturing complementary information to enhance caption generation. Additionally, we propose a weighted averaging technique to combine the outputs of all GRUs in the stacked decoder. Furthermore, a comparison-based beam search strategy is incorporated to refine caption selection. The results demonstrate that our fusion-based approach, along with the enhanced stacked decoder, significantly outperforms both the transformer-based state-of-the-art model and other LSTM-based baselines.
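The weighted averaging of stacked decoder outputs mentioned in the abstract can be sketched as follows. This is an illustration under stated assumptions only: the per-layer scores are assumed to be normalized with a softmax, and `combine_layer_outputs` is a hypothetical name; the abstract does not specify how the paper computes or learns its weights.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layer_outputs(
    layer_outputs: Sequence[Sequence[float]],
    layer_scores: Sequence[float],
) -> List[float]:
    """Weighted average of the hidden vectors produced by each stacked layer.

    layer_outputs holds one hidden vector per decoder layer (all the same
    length); layer_scores holds one raw score per layer, assumed learnable
    in a real model and softmax-normalized here.
    """
    if len(layer_outputs) != len(layer_scores):
        raise ValueError("need exactly one score per layer output")
    weights = softmax(layer_scores)
    dim = len(layer_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, layer_outputs))
        for i in range(dim)
    ]
```

With equal scores the result reduces to a plain mean of the layer outputs, and raising one layer's score shifts the combined vector toward that layer.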

artificial intelligence, deep learning, machine learning, (4 more...)

arXiv.org Artificial Intelligence

2502.09282

Genre: Research Report > Promising Solution (0.53)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.80)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection

Yu, Yi, Yang, Xue, Li, Yansheng, Han, Zhenjun, Da, Feipeng, Yan, Junchi

arXiv.org Artificial Intelligence, Feb-13-2025

Accurately estimating the orientation of visual objects with compact rotated bounding boxes (RBoxes) has become a prominent demand, which challenges existing object detection paradigms that only use horizontal bounding boxes (HBoxes). To equip the detectors with orientation awareness, supervised regression/classification modules have been introduced at the high cost of rotation annotation. Meanwhile, some existing datasets with oriented objects are already annotated with horizontal boxes or even single points. Effectively utilizing these weaker single-point and horizontal annotations to train an oriented object detector (OOD) is attractive yet remains an open problem. We develop Wholly-WOOD, a weakly-supervised OOD framework capable of wholly leveraging various labeling forms (Points, HBoxes, RBoxes, and their combination) in a unified fashion. Using only HBoxes for training, Wholly-WOOD achieves performance very close to that of its RBox-trained counterpart on remote sensing and other areas, significantly reducing the labor-intensive annotation effort for oriented objects. The source codes are available at https://github.com/VisionXLab/whollywood (PyTorch-based) and https://github.com/VisionXLab/whollywood-jittor (Jittor-based).

artificial intelligence, detection, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.09471

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports (0.67)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
