AITopics | Geophysical Analysis & Survey

Geophysical Analysis & Survey

Deep learning waterways for rural infrastructure development

arXiv.org Artificial Intelligence, Nov-18-2024

Surprisingly, a number of Earth's waterways remain unmapped, with a significant number in low- and middle-income countries. Here we build a computer vision model (WaterNet) to learn the location of waterways in the United States, based on high-resolution satellite imagery and digital elevation models, and then deploy this in novel environments in the African continent. Our outputs provide detail of waterway structures hitherto unmapped. When assessed against community needs requests for rural bridge building related to access to schools, health care facilities and agricultural markets, we find these newly generated waterways capture on average 93% (country range: 88-96%) of these requests, whereas OpenStreetMap and the state-of-the-art data from TDX-Hydro capture only 36% (5-72%) and 62% (37-85%), respectively. Because these new machine-learning-enabled maps are built on public and operational data acquisition, this approach offers promise for capturing humanitarian needs and planning for social development in places where cartographic efforts have so far failed to deliver. The improved performance in identifying community needs missed by existing data suggests significant value for rural infrastructure development and better targeting of development interventions.

artificial intelligence, machine learning, waterway, (17 more...)

arXiv.org Artificial Intelligence

2411.1359

Country:

Africa > Ethiopia (0.05)
Africa > Rwanda (0.05)
Africa > Côte d'Ivoire (0.05)
(16 more...)

Genre: Research Report (0.64)

Industry:

Social Sector (0.66)
Education (0.54)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Scaling Deep Learning Research with Kubernetes on the NRP Nautilus HyperCluster

Hurt, J. Alex, Ouadou, Anes, Alshehri, Mariam, Scott, Grant J.

arXiv.org Artificial Intelligence, Nov-18-2024

Throughout the scientific computing space, deep learning algorithms have shown excellent performance in a wide range of applications. As these deep neural networks (DNNs) continue to mature, the necessary compute required to train them has continued to grow. Today, modern DNNs require millions of FLOPs and days to weeks of training to generate a well-trained model. The training times required for DNNs are oftentimes a bottleneck in DNN research for a variety of deep learning applications, and as such, accelerating and scaling DNN training enables more robust and accelerated research. To that end, in this work, we explore utilizing the NRP Nautilus HyperCluster to automate and scale deep learning model training for three separate applications of DNNs, including overhead object detection, burned area segmentation, and deforestation detection. In total, 234 deep neural models are trained on Nautilus, for a total time of 4,040 hours. Deep convolutional neural networks (DCNNs) have been established as the state of the art in computer vision (CV) and have shown superior performance in visual tasks for many domains, including remote sensing. With billions of pixels being collected by overhead sources like satellites, remote sensing (RS) is becoming evermore a big-data problem domain, with endless amounts of data available to enable CV applications. Due in part to this data availability, the training and optimization of deep networks for RS applications has been explored to great lengths in recent years. In 2017, researchers investigated utilizing DCNNs for land-cover classification in overhead imagery along with techniques such as transfer learning and data augmentation[1]. 
This work was then extended into multi-network fusion research, where multiple DCNNs trained on overhead satellite imagery were fused using simple fusion techniques such as voting and arrogance [2] and then compared to more complex fusion algorithms such as the Choquet and Sugeno Fuzzy Integral [3], [4]. While these studies explored utilizing DCNNs to perform classification on overhead RS imagery, further exploration was required in broad area search, in which DCNNs are trained and used not on clean pre-processed datasets, but instead applied to large swaths of overhead imagery with the goal of finding all instances of a given object or terrain.
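
The simple fusion techniques named above can be illustrated with a minimal sketch. It assumes each network outputs per-class probabilities, takes "voting" to mean a majority vote over each network's top label, and takes "arrogance" to mean trusting the single most confident network per sample. Both readings, and the function names `vote_fusion` and `arrogance_fusion`, are assumptions for illustration and are not taken from the cited papers.

```python
import numpy as np

def vote_fusion(probs):
    """Majority vote over per-model argmax labels.

    probs: array of shape (n_models, n_samples, n_classes).
    Ties are resolved in favour of the lowest class index.
    """
    probs = np.asarray(probs, dtype=float)
    n_models, n_samples, n_classes = probs.shape
    labels = probs.argmax(axis=2)                 # (n_models, n_samples)
    counts = np.zeros((n_samples, n_classes))
    for model_labels in labels:
        counts[np.arange(n_samples), model_labels] += 1
    return counts.argmax(axis=1)

def arrogance_fusion(probs):
    """Assumed reading of 'arrogance': keep the label of the most confident model."""
    probs = np.asarray(probs, dtype=float)
    n_samples = probs.shape[1]
    labels = probs.argmax(axis=2)                 # (n_models, n_samples)
    confidence = probs.max(axis=2)                # (n_models, n_samples)
    best_model = confidence.argmax(axis=0)        # (n_samples,)
    return labels[best_model, np.arange(n_samples)]

# Three hypothetical networks, two image chips, three classes.
probs = np.array([
    [[0.60, 0.30, 0.10], [0.20, 0.50, 0.30]],
    [[0.50, 0.40, 0.10], [0.10, 0.20, 0.70]],
    [[0.05, 0.90, 0.05], [0.30, 0.40, 0.30]],
])
print(vote_fusion(probs), arrogance_fusion(probs))
```

The two rules can disagree: in this toy input the vote follows the two networks that agree, while the confidence-based rule follows the single network with the highest score.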

application, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.12038

Country:

Pacific Ocean > North Pacific Ocean > South China Sea (0.04)
North America > United States > Missouri > Boone County > Columbia (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.76)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Large Vision-Language Models for Remote Sensing Visual Question Answering

Siripong, Surasakdi, Chaiyapan, Apirak, Phonchai, Thanakorn

arXiv.org Artificial Intelligence, Nov-16-2024

Remote Sensing Visual Question Answering (RSVQA) is a challenging task that involves interpreting complex satellite imagery to answer natural language questions. Traditional approaches often rely on separate visual feature extractors and language processing models, which can be computationally intensive and limited in their ability to handle open-ended questions. In this paper, we propose a novel method that leverages a generative Large Vision-Language Model (LVLM) to streamline the RSVQA process. Our approach consists of a two-step training strategy: domain-adaptive pretraining and prompt-based finetuning. This method enables the LVLM to generate natural language answers by conditioning on both visual and textual inputs, without the need for predefined answer categories. We evaluate our model on the RSVQAxBEN dataset, demonstrating superior performance compared to state-of-the-art baselines. Additionally, a human evaluation study shows that our method produces answers that are more accurate, relevant, and fluent. The results highlight the potential of generative LVLMs in advancing the field of remote sensing analysis.

large language model, lvlm, question answering, (15 more...)

arXiv.org Artificial Intelligence

2411.10857

Country:

Europe > Austria > Vienna (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.72)

Heuristical Comparison of Vision Transformers Against Convolutional Neural Networks for Semantic Segmentation on Remote Sensing Imagery

Dahal, Ashim, Murad, Saydul Akbar, Rahimi, Nick

arXiv.org Artificial Intelligence, Nov-13-2024

Vision Transformers (ViT) have recently brought a new wave of research in the field of computer vision. These models have done particularly well in image classification and segmentation. Research on semantic and instance segmentation has accelerated with the inception of the new architecture, with over 80% of the top 20 benchmarks for the iSAID dataset being based either on the ViT architecture or on the attention mechanism behind its success. This paper focuses on the heuristic comparison of three key factors of using (or not using) ViT for semantic segmentation of remote sensing aerial images on the iSAID dataset. The experimental results were examined against the following objectives: 1. Use of a weighted fused loss function for the maximum mean Intersection over Union (mIoU) score, Dice score, and minimization or conservation of entropy or class representation, 2. Comparison of transfer learning on Meta's MaskFormer, a ViT-based semantic segmentation model, against generic UNet Convolutional Neural Networks (CNNs) judged over mIoU, Dice scores, training efficiency, and inference time, and 3. What do we lose for what we gain? i.e., the comparison of the two models against current state-of-the-art segmentation models. We show that the use of the novel combined weighted loss function significantly boosts the CNN model's performance compared to transfer learning the ViT. The code for this implementation can be found at https://github.com/ashimdahal/ViT-vs-CNN-ImageSegmentation.
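
As a rough illustration of what a weighted fused segmentation loss can look like, the sketch below combines cross-entropy, a soft Dice term, and a soft IoU term with scalar weights. The specific terms, the weights `w_ce`, `w_dice`, `w_iou`, and the function name `combined_loss` are assumptions made for this example; the abstract does not give the authors' exact formulation, which is in their repository.

```python
import numpy as np

def combined_loss(probs, onehot, w_ce=1.0, w_dice=1.0, w_iou=1.0, eps=1e-7):
    """Weighted sum of cross-entropy, (1 - soft Dice) and (1 - soft IoU).

    probs:  (n_pixels, n_classes) predicted class probabilities.
    onehot: (n_pixels, n_classes) one-hot ground truth.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    onehot = np.asarray(onehot, dtype=float)

    # Mean cross-entropy over pixels.
    ce = -np.mean(np.sum(onehot * np.log(probs), axis=1))

    # Per-class soft overlap statistics, averaged over classes.
    inter = np.sum(probs * onehot, axis=0)
    total = np.sum(probs, axis=0) + np.sum(onehot, axis=0)
    dice = np.mean((2.0 * inter + eps) / (total + eps))
    iou = np.mean((inter + eps) / (total - inter + eps))

    return w_ce * ce + w_dice * (1.0 - dice) + w_iou * (1.0 - iou)

# Two pixels, two classes: a perfect and a fully wrong prediction.
truth = np.array([[1.0, 0.0], [0.0, 1.0]])
perfect = truth.copy()
wrong = truth[::-1].copy()
print(combined_loss(perfect, truth), combined_loss(wrong, truth))
```

Raising `w_dice` or `w_iou` relative to `w_ce` shifts the objective toward region overlap, which is one common motivation for fusing such terms when classes are imbalanced.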

dataset, loss function, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2411.09101

Country:

North America > United States > Mississippi > Forrest County > Hattiesburg (0.04)
North America > United States > Alabama (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.72)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG

Zhang, Zilun, Shen, Haozhan, Zhao, Tiancheng, Wang, Yuhao, Chen, Bin, Cai, Yuxiang, Shang, Yongheng, Yin, Jianwei

arXiv.org Artificial Intelligence, Nov-12-2024

Ultra High Resolution (UHR) remote sensing imagery (RSI) (e.g. 100,000 × 100,000 pixels or more) poses a significant challenge for current Remote Sensing Multimodal Large Language Models (RSMLLMs). If the UHR image is resized to a standard input image size, the extensive spatial and contextual information that UHR images contain will be neglected. Otherwise, the original size of these images often exceeds the token limits of standard RSMLLMs, making it difficult to process the entire image and capture long-range dependencies to answer the query based on the abundant visual context. In this paper, we introduce ImageRAG for RS, a training-free framework to address the complexities of analyzing UHR remote sensing imagery. By transforming the UHR remote sensing image analysis task into an image long-context selection task, we design an innovative image contextual retrieval mechanism based on the Retrieval-Augmented Generation (RAG) technique, denoted as ImageRAG. ImageRAG's core innovation lies in its ability to selectively retrieve and focus on the most relevant portions of the UHR image as visual contexts that pertain to a given query. A fast path and a slow path are proposed in this framework to handle this task efficiently and effectively. ImageRAG allows RSMLLMs to manage extensive context and spatial information from UHR RSI, ensuring the analysis is both accurate and efficient.
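
The general idea of selecting only the relevant portions of a very large image can be sketched as tile-then-retrieve. The example below is not ImageRAG's pipeline: the tiling scheme, the use of cosine similarity, and the mean-colour "embedding" are stand-in assumptions (a real system would use a learned image-text encoder), and `tile_image` and `top_k_patches` are hypothetical names.

```python
import numpy as np

def tile_image(image, tile):
    """Split an (H, W, C) array into non-overlapping tile x tile patches.

    Returns the patches and their (row, col) origins; partial tiles at the
    right and bottom edges are dropped in this sketch.
    """
    h, w = image.shape[:2]
    patches, origins = [], []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            patches.append(image[r:r + tile, c:c + tile])
            origins.append((r, c))
    return patches, origins

def top_k_patches(patch_embs, query_emb, k):
    """Rank patches by cosine similarity to the query embedding; return top-k indices."""
    p = np.asarray(patch_embs, dtype=float)
    q = np.asarray(query_emb, dtype=float)
    sims = p @ q / (np.linalg.norm(p, axis=1) * np.linalg.norm(q) + 1e-12)
    order = np.argsort(-sims)[:k]
    return order.tolist(), sims[order].tolist()

# Toy 4x4 RGB image with one red 2x2 region in the top-right corner.
image = np.zeros((4, 4, 3))
image[0:2, 2:4, 0] = 1.0
patches, origins = tile_image(image, tile=2)

# Placeholder embedding: mean colour per patch (an assumption, not a real encoder).
embs = [p.mean(axis=(0, 1)) for p in patches]
best, scores = top_k_patches(embs, query_emb=[1.0, 0.0, 0.0], k=1)
print(origins[best[0]], scores[0])
```

Only the retrieved patches, rather than the full image, would then be passed to the language model as visual context, which keeps the input within its token limit.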

arxiv, imagerag, zhang, (17 more...)

arXiv.org Artificial Intelligence

2411.07688

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.54)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Aquila-plus: Prompt-Driven Visual-Language Models for Pixel-Level Remote Sensing Image Understanding

Lu, Kaixuan

arXiv.org Artificial Intelligence, Nov-9-2024

The recent development of vision language models (VLMs) has led to significant advances in visual-language integration through visual instruction tuning, and they have rapidly evolved in the field of remote sensing image understanding, demonstrating their powerful capabilities. However, existing RSVLMs mainly focus on image-level or frame-level understanding, making it difficult to achieve fine-grained pixel-level visual-language alignment. Additionally, the lack of mask-based instructional data limits their further development. In this paper, we propose a mask-text instruction tuning method called Aquila-plus, which extends the capabilities of RSVLMs to achieve pixel-level visual understanding by incorporating fine-grained mask regions into language instructions. To achieve this, we first meticulously constructed a mask region-text dataset containing 100K samples, and then designed a visual-language model by injecting pixel-level representations into a large language model (LLM). Specifically, Aquila-plus uses a convolutional CLIP as the visual encoder and employs a mask-aware visual extractor to extract precise visual mask features from high-resolution inputs. Experimental results demonstrate that Aquila-plus outperforms existing methods in various region understanding tasks, showcasing its novel capabilities in pixel-level instruction tuning.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.06142

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Aquila: A Hierarchically Aligned Visual-Language Model for Enhanced Remote Sensing Image Comprehension

Lu, Kaixuan, Zhang, Ruiqian, Huang, Xiao, Xie, Yuxing

arXiv.org Artificial Intelligence, Nov-9-2024

Recently, large vision language models (VLMs) have made significant strides in visual language capabilities through visual instruction tuning, showing great promise in the field of remote sensing image interpretation. However, existing remote sensing vision language models (RSVLMs) often fall short in capturing the complex characteristics of remote sensing scenes, as they typically rely on low-resolution, single-scale visual features and simplistic methods to map visual features to language features. In this paper, we present Aquila, an advanced visual language foundation model designed to enable richer visual feature representation and more precise visual-language feature alignment for remote sensing images. Our approach introduces a learnable Hierarchical Spatial Feature Integration (SFI) module that supports high-resolution image inputs and aggregates multi-scale visual features, allowing for the detailed representation of complex visual information. Additionally, the SFI module is repeatedly integrated into the layers of the large language model (LLM) to achieve deep visual language feature alignment, without compromising the model's performance in natural language processing tasks. These innovations, capturing detailed visual effects through higher-resolution and multi-scale input and enhancing feature alignment, significantly improve the model's ability to learn from image-text data. We validate the effectiveness of Aquila through extensive quantitative experiments and qualitative analyses, demonstrating its superior performance.

aquila, arxiv preprint arxiv, language model, (14 more...)

arXiv.org Artificial Intelligence

2411.06074

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre:

Research Report > Promising Solution (0.46)
Research Report > New Finding (0.46)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

STARS: Sensor-agnostic Transformer Architecture for Remote Sensing

King, Ethan, Rodriguez, Jaime, Llanes, Diego, Doster, Timothy, Emerson, Tegan, Koch, James

arXiv.org Artificial Intelligence, Nov-8-2024

We present a sensor-agnostic spectral transformer as the basis for spectral foundation models. To that end, we introduce a Universal Spectral Representation (USR) that leverages sensor meta-data, such as sensing kernel specifications and sensing wavelengths, to encode spectra obtained from any spectral instrument into a common representation, such that a single model can ingest data from any sensor. Furthermore, we develop a methodology for pre-training such models in a self-supervised manner using a novel random sensor-augmentation and reconstruction pipeline to learn spectral features independent of the sensing paradigm. We demonstrate that our architecture can learn sensor independent spectral features that generalize effectively to sensors not seen during training. This work sets the stage for training foundation models that can both leverage and be effective for the growing diversity of spectral data.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.05714

Country:

North America > United States > Washington > Benton County > Richland (0.04)
North America > United States > Texas > Harris County > Houston (0.04)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

A Nerf-Based Color Consistency Method for Remote Sensing Images

Zuo, Zongcheng, Li, Yuanxiang, Zhang, Tongtong

arXiv.org Artificial Intelligence, Nov-8-2024

Due to different seasons, illumination, and atmospheric conditions, the photometric properties of acquired images vary greatly, which leads to obvious stitching seams at the edges of the mosaic image. Traditional methods can be divided into two categories: absolute radiometric correction and relative radiometric normalization. We propose a NeRF-based method of color consistency correction for multi-view images, which weaves image features together using implicit expressions and then re-illuminates the feature space to generate a fusion image with a new perspective. We chose Superview-1 satellite images and UAV images with a large range and time difference for the experiment. Experimental results show that the synthesized image generated by our method has an excellent visual effect and smooth color transitions at the edges.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2411.05557

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report (0.70)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.51)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Uncertainty Prediction Neural Network (UpNet): Embedding Artificial Neural Network in Bayesian Inversion Framework to Quantify the Uncertainty of Remote Sensing Retrieval

Fan, Dasheng, Mu, Xihan, Lai, Yongkang, Xie, Donghui, Yan, Guangjian

arXiv.org Artificial Intelligence, Nov-7-2024

For the retrieval of large-scale vegetation biophysical parameters, the inversion of radiative transfer models (RTMs) is the most commonly used approach. In recent years, Artificial Neural Network (ANN)-based methods have become the mainstream for inverting RTMs due to their high accuracy and computational efficiency, and they have been widely used in the retrieval of biophysical variables (BVs). However, because they lack an interpretation in terms of Bayesian inversion theory, they face challenges in quantifying the retrieval uncertainty, a crucial metric for product quality validation and downstream applications such as data assimilation or ecosystem carbon cycle modeling. This study proved that an ANN trained with squared loss outputs the posterior mean, providing a rigorous foundation for its uncertainty quantification, regularization, and incorporation of prior information. A Bayesian theoretical framework was subsequently proposed for ANN-based methods. Using this framework, we derived a new algorithm called Uncertainty Prediction Neural Network (UpNet), which enables the simultaneous training of two ANNs to retrieve BVs and provide retrieval uncertainty. To validate our method, we compared UpNet with the standard Bayesian inference method, i.e., Markov Chain Monte Carlo (MCMC), in the inversion of a widely used RTM called ProSAIL for retrieving BVs and estimating uncertainty. The results demonstrated that the BVs retrieved and the uncertainties estimated by UpNet were highly consistent with those from MCMC, while achieving over a million-fold acceleration. These results indicated that UpNet has significant potential for fast retrieval and uncertainty quantification of BVs or other parameters with medium- and high-resolution remote sensing data. Our Python implementation is available at: https://github.com/Dash-RSer/UpNet.
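
The claim that a squared-loss regressor outputs the posterior mean follows from a standard decomposition, sketched here. The symbols are assumptions for illustration: $\theta$ is the biophysical variable, $x$ the observed reflectance, and $f$ the network. The second step shows one way a second regressor could estimate the posterior variance; whether UpNet uses exactly this construction is not stated in the abstract.

```latex
\begin{align}
\mathbb{E}\!\left[(\theta - f(x))^2 \mid x\right]
  &= \operatorname{Var}(\theta \mid x)
   + \left(\mathbb{E}[\theta \mid x] - f(x)\right)^2 , \\
f^{*}(x) &= \arg\min_{f(x)} \mathbb{E}\!\left[(\theta - f(x))^2 \mid x\right]
          = \mathbb{E}[\theta \mid x] , \\
\mathbb{E}\!\left[(\theta - f^{*}(x))^2 \mid x\right]
  &= \operatorname{Var}(\theta \mid x) .
\end{align}
```

The first line holds because the cross term vanishes under the conditional expectation. The variance term does not depend on $f$, so the minimum is reached when the squared bias is zero, which gives the second line. The third line suggests that a second network trained with squared loss on the target $(\theta - f^{*}(x))^2$ would, by the same argument, approximate the posterior variance.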

mcmc, reflectance, retrieval uncertainty, (16 more...)

arXiv.org Artificial Intelligence

2411.04556

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York (0.04)
Europe (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.65)
Food & Agriculture > Agriculture (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
