Geophysical Analysis & Survey
A Robust and Low Complexity Deep Learning Model for Remote Sensing Image Classification
Le, Cam, Pham, Lam, NVN, Nghia, Nguyen, Truong, Trang, Le Hong
In this paper, we present a robust and low complexity deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the scene of a remote sensing image. In particular, we firstly evaluate different low complexity and benchmark deep neural networks: MobileNetV1, MobileNetV2, NASNetMobile, and EfficientNetB0, which present the number of trainable parameters lower than 5 Million (M). After indicating best network architecture, we further improve the network performance by applying attention schemes to multiple feature maps extracted from middle layers of the network. To deal with the issue of increasing the model footprint as using attention schemes, we apply the quantization technique to satisfy the maximum of 20 MB memory occupation. By conducting extensive experiments on the benchmark datasets NWPU-RESISC45, we achieve a robust and low-complexity model, which is very competitive to the state-of-the-art systems and potential for real-life applications on edge devices.
Scale-Semantic Joint Decoupling Network for Image-text Retrieval in Remote Sensing
Zheng, Chengyu, song, Ning, Zhang, Ruoyu, Huang, Lei, Wei, Zhiqiang, Nie, Jie
Image-text retrieval in remote sensing aims to provide flexible information for data analysis and application. In recent years, state-of-the-art methods are dedicated to ``scale decoupling'' and ``semantic decoupling'' strategies to further enhance the capability of representation. However, these previous approaches focus on either the disentangling scale or semantics but ignore merging these two ideas in a union model, which extremely limits the performance of cross-modal retrieval models. To address these issues, we propose a novel Scale-Semantic Joint Decoupling Network (SSJDN) for remote sensing image-text retrieval. Specifically, we design the Bidirectional Scale Decoupling (BSD) module, which exploits Salience Feature Extraction (SFE) and Salience-Guided Suppression (SGS) units to adaptively extract potential features and suppress cumbersome features at other scales in a bidirectional pattern to yield different scale clues. Besides, we design the Label-supervised Semantic Decoupling (LSD) module by leveraging the category semantic labels as prior knowledge to supervise images and texts probing significant semantic-related information. Finally, we design a Semantic-guided Triple Loss (STL), which adaptively generates a constant to adjust the loss function to improve the probability of matching the same semantic image and text and shorten the convergence time of the retrieval model. Our proposed SSJDN outperforms state-of-the-art approaches in numerical experiments conducted on four benchmark remote sensing datasets.
Country-wide Retrieval of Forest Structure From Optical and SAR Satellite Imagery With Deep Ensembles
Becker, Alexander, Russo, Stefania, Puliti, Stefano, Lang, Nico, Schindler, Konrad, Wegner, Jan Dirk
Monitoring and managing Earth's forests in an informed manner is an important requirement for addressing challenges like biodiversity loss and climate change. While traditional in situ or aerial campaigns for forest assessments provide accurate data for analysis at regional level, scaling them to entire countries and beyond with high temporal resolution is hardly possible. In this work, we propose a method based on deep ensembles that densely estimates forest structure variables at country-scale with 10-meter resolution, using freely available satellite imagery as input. Our method jointly transforms Sentinel-2 optical images and Sentinel-1 synthetic-aperture radar images into maps of five different forest structure variables: 95th height percentile, mean height, density, Gini coefficient, and fractional cover. We train and test our model on reference data from 41 airborne laser scanning missions across Norway and demonstrate that it is able to generalize to unseen test regions, achieving normalized mean absolute errors between 11% and 15%, depending on the variable. Our work is also the first to propose a variant of so-called Bayesian deep learning to densely predict multiple forest structure variables with well-calibrated uncertainty estimates from satellite imagery. The uncertainty information increases the trustworthiness of the model and its suitability for downstream tasks that require reliable confidence estimates as a basis for decision making. We present an extensive set of experiments to validate the accuracy of the predicted maps as well as the quality of the predicted uncertainties. To demonstrate scalability, we provide Norway-wide maps for the five forest structure variables.
LSVL: Large-scale season-invariant visual localization for UAVs
Kinnari, Jouko, Renzulli, Riccardo, Verdoja, Francesco, Kyrki, Ville
Localization of autonomous unmanned aerial vehicles (UAVs) relies heavily on Global Navigation Satellite Systems (GNSS), which are susceptible to interference. Especially in security applications, robust localization algorithms independent of GNSS are needed to provide dependable operations of autonomous UAVs also in interfered conditions. Typical non-GNSS visual localization approaches rely on known starting pose, work only on a small-sized map, or require known flight paths before a mission starts. We consider the problem of localization with no information on initial pose or planned flight path. We propose a solution for global visual localization on a map at scale up to 100 km2, based on matching orthoprojected UAV images to satellite imagery using learned season-invariant descriptors. We show that the method is able to determine heading, latitude and longitude of the UAV at 12.6-18.7 m lateral translation error in as few as 23.2-44.4 updates from an uninformed initialization, also in situations of significant seasonal appearance difference (winter-summer) between the UAV image and the map. We evaluate the characteristics of multiple neural network architectures for generating the descriptors, and likelihood estimation methods that are able to provide fast convergence and low localization error. We also evaluate the operation of the algorithm using real UAV data and evaluate running time on a real-time embedded platform. We believe this is the first work that is able to recover the pose of an UAV at this scale and rate of convergence, while allowing significant seasonal difference between camera observations and map.
MapInWild: A Remote Sensing Dataset to Address the Question What Makes Nature Wild
Ekim, Burak, Stomberg, Timo T., Roscher, Ribana, Schmitt, Michael
I. INTRODUCTION The advancement in deep learning (DL) techniques has led to a notable increase in the number and size of annotated datasets in a variety of domains, with remote sensing (RS) being no exception [1]. Also, an increase in earth observation (EO) missions and easy access to globally available and free geodata have opened up new research opportunities. Although numerous RS datasets have been published in the past years [2]-[6], most of them addressed tasks concerning man-made environments such as building footprint extraction and road network classification, leaving the environmental and ecology-related sub-areas of remote sensing underrepresented. The ESA WorldCover map legend is given below the figure. In this community, the classification task can be machine learning model in the form of deep neural networks. While some methods frame the RS-related classification (usually called semantic segmentation by tasks within the context of perturbation-seeking generative the computer vision community) the task outputs denselyannotated adversarial networks [14], some others made use of uncertainty prediction maps on a pixel scale by separating the estimation applied to deep ensembles [15] and self-attention input into distinct and semantically coherent segments.
Land Use Prediction using Electro-Optical to SAR Few-Shot Transfer Learning
Hussing, Marcel, Li, Karen, Eaton, Eric
Satellite image analysis has important implications for land use, urbanization, and ecosystem monitoring. Deep learning methods can facilitate the analysis of different satellite modalities, such as electro-optical (EO) and synthetic aperture radar (SAR) imagery, by supporting knowledge transfer between the modalities to compensate for individual shortcomings. Recent progress has shown how distributional alignment of neural network embeddings can produce powerful transfer learning models by employing a sliced Wasserstein distance (SWD) loss. We analyze how this method can be applied to Sentinel-1 and -2 satellite imagery and develop several extensions toward making it effective in practice. In an application to few-shot Local Climate Zone (LCZ) prediction, we show that these networks outperform multiple common baselines on datasets with a large number of classes. Further, we provide evidence that instance normalization can significantly stabilize the training process and that explicitly shaping the embedding space using supervised contrastive learning can lead to improved performance.
Entropy Approximation by Machine Learning Regression: Application for Irregularity Evaluation of Images in Remote Sensing
Velichko, Andrei, Belyaev, Maksim, Wagner, Matthias P., Taravat, Alireza
Approximation of entropies of various types using machine learning (ML) regression methods are shown for the first time. The ML models presented in this study define the complexity of the short time series by approximating dissimilar entropy techniques such as Singular value decomposition entropy (SvdEn), Permutation entropy (PermEn), Sample entropy (SampEn) and Neural Network entropy (NNetEn) and their 2D analogies. A new method for calculating SvdEn2D, PermEn2D and SampEn2D for 2D images was tested using the technique of circular kernels. Training and testing datasets on the basis of Sentinel-2 images are presented (two training images and one hundred and ninety-eight testing images). The results of entropy approximation are demonstrated using the example of calculating the 2D entropy of Sentinel-2 images and R^2 metric evaluation. The applicability of the method for the short time series with a length from N = 5 to N = 113 elements is shown. A tendency for the R^2 metric to decrease with an increase in the length of the time series was found. For SvdEn entropy, the regression accuracy is R^2 > 0.99 for N = 5 and R^2 > 0.82 for N = 113. The best metrics were observed for the ML_SvdEn2D and ML_NNetEn2D models. The results of the study can be used for fundamental research of entropy approximations of various types using ML regression, as well as for accelerating entropy calculations in remote sensing. The versatility of the model is shown on a synthetic chaotic time series using Planck map and logistic map.
Conditional Progressive Generative Adversarial Network for satellite image generation
Cardoso, Renato, Vallecorsa, Sofia, Nemni, Edoardo
Image generation and image completion are rapidly evolving fields, thanks to machine learning algorithms that are able to realistically replace missing pixels. However, generating large high resolution images, with a large level of details, presents important computational challenges. In this work, we formulate the image generation task as completion of an image where one out of three corners is missing. We then extend this approach to iteratively build larger images with the same level of detail. Our goal is to obtain a scalable methodology to generate high resolution samples typically found in satellite imagery data sets. We introduce a conditional progressive Generative Adversarial Networks (GAN), that generates the missing tile in an image, using as input three initial adjacent tiles encoded in a latent vector by a Wasserstein auto-encoder. We focus on a set of images used by the United Nations Satellite Centre (UNOSAT) to train flood detection tools, and validate the quality of synthetic images in a realistic setup.
Unsupervised Wildfire Change Detection based on Contrastive Learning
Zhang, Beichen, Wang, Huiqi, Alabri, Amani, Bot, Karol, McCall, Cole, Hamilton, Dale, Růžička, Vít
The accurate characterization of the severity of the wildfire event strongly contributes to the characterization of the fuel conditions in fire-prone areas, and provides valuable information for disaster response. The aim of this study is to develop an autonomous system built on top of high-resolution multispectral satellite imagery, with an advanced deep learning method for detecting burned area change. This work proposes an initial exploration of using an unsupervised model for feature extraction in wildfire scenarios. It is based on the contrastive learning technique SimCLR, which is trained to minimize the cosine distance between augmentations of images. The distance between encoded images can also be used for change detection. We propose changes to this method that allows it to be used for unsupervised burned area detection and following downstream tasks. We show that our proposed method outperforms the tested baseline approaches.
High-precision Density Mapping of Marine Debris and Floating Plastics via Satellite Imagery
Booth, Henry, Ma, Wanli, Karakus, Oktay
Combining multi-spectral satellite data and machine learning has been suggested as a method for monitoring plastic pollutants in the ocean environment. Recent studies have made theoretical progress regarding the identification of marine plastic via machine learning. However, no study has assessed the application of these methods for mapping and monitoring marine-plastic density. As such, this paper comprised of three main components: (1) the development of a machine learning model, (2) the construction of the MAP-Mapper, an automated tool for mapping marine-plastic density, and finally (3) an evaluation of the whole system for out-of-distribution test locations. The findings from this paper leverage the fact that machine learning models need to be high-precision to reduce the impact of false positives on results. The developed MAP-Mapper architectures provide users choices to reach high-precision ($\textit{abbv.}$ -HP) or optimum precision-recall ($\textit{abbv.}$ -Opt) values in terms of the training/test data set. Our MAP-Mapper-HP model greatly increased the precision of plastic detection to 95\%, whilst MAP-Mapper-Opt reaches precision-recall pair of 87\%-88\%. The MAP-Mapper contributes to the literature with the first tool to exploit advanced deep/machine learning and multi-spectral imagery to map marine-plastic density in automated software. The proposed data pipeline has taken a novel approach to map plastic density in ocean regions. As such, this enables an initial assessment of the challenges and opportunities of this method to help guide future work and scientific study.