Geophysical Analysis & Survey
Deep Metric Learning for Unsupervised Remote Sensing Change Detection
Bandara, Wele Gedara Chaminda, Patel, Vishal M.
Remote Sensing Change Detection (RS-CD) aims to detect relevant changes from Multi-Temporal Remote Sensing Images (MT-RSIs), which aids in various RS applications such as land cover, land use, human development analysis, and disaster response. The performance of existing RS-CD methods is attributed to training on large annotated datasets. Furthermore, most of these models are less transferable in the sense that the trained model often performs very poorly when there is a domain gap between training and test datasets. This paper proposes an unsupervised CD method based on deep metric learning that can deal with both of these issues. Given an MT-RSI, the proposed method generates the corresponding change probability map by iteratively optimizing an unsupervised CD loss, without training on a large dataset. Our unsupervised CD method consists of two interconnected deep networks, namely Deep-Change Probability Generator (D-CPG) and Deep-Feature Extractor (D-FE). The D-CPG is designed to predict change and no-change probability maps for a given MT-RSI, while D-FE is used to extract deep features of the MT-RSI that are then used in the proposed unsupervised CD loss. We use transfer learning to initialize the parameters of D-FE. We iteratively optimize the parameters of D-CPG and D-FE for a given MT-RSI by minimizing the proposed unsupervised "similarity-dissimilarity loss". This loss is motivated by the principle of metric learning: we simultaneously maximize the distance between changed pixel pairs and minimize the distance between unchanged pixel pairs, in both the bi-temporal image domain and the deep feature domain. The experiments conducted on three CD datasets show that our unsupervised CD method achieves significant improvements over the state-of-the-art supervised and unsupervised CD methods. Code is available at https://github.com/wgcban/Metric-CD
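The metric-learning idea behind the similarity-dissimilarity loss can be illustrated with a small sketch. The abstract does not give the exact formula, so the contrastive-style form below, the function name, and the `margin` parameter are all assumptions made for illustration, not the authors' loss.

```python
import math


def similarity_dissimilarity_loss(feats_t1, feats_t2, change_prob, margin=1.0):
    """Contrastive-style loss over co-located pixel pairs (hypothetical form).

    feats_t1, feats_t2: per-pixel feature vectors (lists of floats) at the two dates
    change_prob: per-pixel change probabilities in [0, 1]
    margin: assumed distance beyond which a changed pair is no longer penalized
    """
    total = 0.0
    for f1, f2, p in zip(feats_t1, feats_t2, change_prob):
        d = math.dist(f1, f2)  # Euclidean distance between the pixel pair
        # No-change pairs (p near 0) are penalized for being far apart.
        similar_term = (1.0 - p) * d ** 2
        # Change pairs (p near 1) are penalized for being closer than the margin.
        dissimilar_term = p * max(0.0, margin - d) ** 2
        total += similar_term + dissimilar_term
    return total / len(change_prob)


# Two pixels: the first is identical across dates, the second differs strongly.
t1 = [[0.0, 0.0], [0.0, 0.0]]
t2 = [[0.0, 0.0], [3.0, 4.0]]
consistent = similarity_dissimilarity_loss(t1, t2, [0.0, 1.0])
inconsistent = similarity_dissimilarity_loss(t1, t2, [1.0, 0.0])
```

A probability map that agrees with the feature distances yields a lower loss than one that contradicts them, which is the signal an optimizer would follow in this simplified setting.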
Implicit Ray-Transformers for Multi-view Remote Sensing Image Segmentation
Qi, Zipeng, Chen, Hao, Liu, Chenyang, Shi, Zhenwei, Zou, Zhengxia
The mainstream CNN-based remote sensing (RS) image semantic segmentation approaches typically rely on massive labeled training data. Such a paradigm struggles with RS multi-view scene segmentation when only a few views are labeled, because it does not consider the 3D information within the scene. In this paper, we propose the "Implicit Ray-Transformer (IRT)", based on Implicit Neural Representation (INR), for RS scene semantic segmentation with sparse labels (such as 4-6 labels per 100 images). We explore a new way of introducing multi-view 3D structure priors to the task for accurate and view-consistent semantic segmentation. The proposed method includes a two-stage learning process. In the first stage, we optimize a neural field to encode the color and 3D structure of the remote sensing scene based on multi-view images. In the second stage, we design a Ray Transformer to leverage the relations between the neural field 3D features and 2D texture features for learning better semantic representations. Unlike previous methods that consider only the 3D prior or only 2D features, we combine additional 2D texture information with the 3D prior by broadcasting CNN features to the point features along each sampled ray. To verify the effectiveness of the proposed method, we construct a challenging dataset containing six synthetic sub-datasets collected from the Carla platform and three real sub-datasets from Google Maps. Experiments show that the proposed method outperforms the CNN-based methods and the state-of-the-art INR-based segmentation methods both quantitatively and qualitatively.
Towards Targeted Change Detection with Heterogeneous Remote Sensing Images for Forest Mortality Mapping
Agersborg, Jørgen A., Luppino, Luigi T., Anfinsen, Stian Normann, Jepsen, Jane Uhd
Several generic methods have recently been developed for change detection in heterogeneous remote sensing data, such as images from synthetic aperture radar (SAR) and multispectral radiometers. However, these are not well suited to detect weak signatures of certain disturbances of ecological systems. To resolve this problem we propose a new approach based on image-to-image translation and one-class classification (OCC). We aim to map forest mortality caused by an outbreak of geometrid moths in a sparsely forested forest-tundra ecotone using multisource satellite images. The images preceding and following the event are collected by Landsat-5 and RADARSAT-2, respectively. Using a recent deep learning method for change-aware image translation, we compute difference images in both satellites' respective domains. These differences are stacked with the original pre- and post-event images and passed to an OCC trained on a small sample from the targeted change class. The classifier produces a credible map of the complex pattern of forest mortality.
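The stacking and one-class classification step can be sketched as follows. This is a toy illustration only: the abstract does not name the classifier, so the centroid-based `CentroidOneClass` below, its `slack` parameter, and the single-band feature values are hypothetical stand-ins rather than the method used in the paper.

```python
import math


def stack_features(pre, post, diff_pre_domain, diff_post_domain):
    """Concatenate the per-pixel vectors from the four sources into one vector."""
    return [
        a + b + c + d
        for a, b, c, d in zip(pre, post, diff_pre_domain, diff_post_domain)
    ]


class CentroidOneClass:
    """Toy one-class classifier: accept points close to the training centroid."""

    def fit(self, samples, slack=1.0):
        n, dim = len(samples), len(samples[0])
        self.centroid = [sum(s[i] for s in samples) / n for i in range(dim)]
        # Radius is the largest training distance, scaled by an assumed slack factor.
        self.radius = slack * max(math.dist(s, self.centroid) for s in samples)
        return self

    def predict(self, samples):
        """Return True for pixels that fall inside the learned radius."""
        return [math.dist(s, self.centroid) <= self.radius for s in samples]


# Three pixels with one band per source (made-up values for illustration).
pre = [[0.1], [0.1], [0.1]]
post = [[0.9], [0.8], [0.1]]
diff_a = [[0.8], [0.7], [0.0]]  # difference image in the pre-event sensor's domain
diff_b = [[0.8], [0.7], [0.0]]  # difference image in the post-event sensor's domain
stacked = stack_features(pre, post, diff_a, diff_b)

# Train only on the two pixels known to belong to the targeted change class.
occ = CentroidOneClass().fit(stacked[:2], slack=1.5)
labels = occ.predict(stacked)
```

Only examples of the targeted change are needed for training, which is the property that makes a one-class formulation attractive when labeled samples of the change class are scarce.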
Generation of non-stationary stochastic fields using Generative Adversarial Networks
Abdellatif, Alhasan, Elsheikh, Ahmed H., Busby, Daniel, Berthet, Philippe
In the context of generating geological facies conditioned on observed data, samples corresponding to all possible conditions are not generally available in the training set, and hence the generation of these realizations depends primarily on the generalization capability of the trained generative model. The problem becomes more complex when applied to non-stationary fields. In this work, we investigate the problem of using Generative Adversarial Network (GAN) models to generate non-stationary geological channelized patterns and examine the models' generalization capability at new spatial modes that were never seen in the given training set. The developed training method, based on spatial conditioning, allowed for effective learning of the correlation between the spatial conditions (i.e. non-stationary maps) and the realizations implicitly, without using additional loss terms or solving optimization problems for every new given data after training. In addition, our models can be trained on 2D and 3D samples. The results on real and artificial datasets show that we were able to generate geologically plausible realizations beyond the training samples and with a strong correlation with the target maps.
Deep hybrid model with satellite imagery: how to combine demand modeling and computer vision for behavior analysis?
Wang, Qingyi, Wang, Shenhao, Zheng, Yunhan, Lin, Hongzhou, Zhang, Xiaohu, Zhao, Jinhua, Walker, Joan
Classical demand modeling analyzes travel behavior using only low-dimensional numeric data (i.e. sociodemographics and travel attributes) but not high-dimensional urban imagery. However, travel behavior depends on the factors represented by both numeric data and urban imagery, thus necessitating a synergetic framework to combine them. This study creates a theoretical framework of deep hybrid models with a crossing structure consisting of a mixing operator and a behavioral predictor, thus integrating the numeric and imagery data into a latent space. Empirically, this framework is applied to analyze travel mode choice using the MyDailyTravel Survey from Chicago as the numeric inputs and satellite images as the imagery inputs. We found that, with our supervision-as-mixing design, deep hybrid models outperform both traditional demand models and recent deep learning models in predicting aggregate and disaggregate travel behavior. The latent space in deep hybrid models can be interpreted, because it reveals meaningful spatial and social patterns. The deep hybrid models can also generate new urban images that do not exist in reality and interpret them with economic theory, such as computing substitution patterns and social welfare changes. Overall, the deep hybrid models demonstrate the complementarity between low-dimensional numeric and high-dimensional imagery data, and between traditional demand modeling and recent deep learning. The framework generalizes the latent classes and variables in classical hybrid demand models to a latent space, and leverages the computational power of deep learning for imagery while retaining economic interpretability grounded in microeconomic foundations.
Novel Machine Learning Approach for Predicting Poverty using Temperature and Remote Sensing Data in Ethiopia
In many developing nations, a lack of poverty data prevents critical humanitarian organizations from responding to large-scale crises. Currently, socioeconomic surveys are the only method implemented on a large scale for organizations and researchers to measure and track poverty. However, the inability to collect survey data efficiently and inexpensively leads to significant temporal gaps in poverty data; these gaps severely limit the ability of organizational entities to address poverty at its root cause. We propose a transfer learning model based on surface temperature change and remote sensing data to extract features useful for predicting poverty rates. Machine learning, supported by data sources of poverty indicators, has the potential to estimate poverty rates accurately and within strict time constraints. Higher temperatures, as a result of climate change, have caused numerous agricultural obstacles, socioeconomic issues, and environmental disruptions, trapping families in developing countries in cycles of poverty. To find patterns of poverty relating to temperature that have the highest influence on spatial poverty rates, we use remote sensing data. The two-step transfer model predicts the temperature delta from high resolution satellite imagery and then extracts image features useful for predicting poverty. The resulting model achieved 80% accuracy on temperature prediction. This method takes advantage of abundant satellite and temperature data to measure poverty in a manner comparable to the existing survey methods and exceeds similar models of poverty prediction.
Improving Representational Continuity via Continued Pretraining
Sun, Michael, Kumar, Ananya, Madaan, Divyam, Liang, Percy
We consider the continual representation learning setting: sequentially pretrain a model $M'$ on tasks $T_1, \ldots, T_T$, and then adapt $M'$ on a small amount of data from each task $T_i$ to check if it has forgotten information from old tasks. Under a kNN adaptation protocol, prior work shows that continual learning methods reduce forgetting compared with naive training (SGD). In reality, practitioners do not use kNN classifiers; they use the adaptation method that works best (e.g., fine-tuning). In this setting, we find that strong continual learning baselines do worse than naive training. Interestingly, we find that a method from the transfer learning community (LP-FT) outperforms naive training and the other continual learning methods. Even with standard kNN evaluation protocols, LP-FT performs comparably with strong continual learning methods (while being simpler and requiring less memory) on three standard benchmarks: sequential CIFAR-10, CIFAR-100, and TinyImageNet. LP-FT also reduces forgetting in a real-world satellite remote sensing dataset (FMoW), and a variant of LP-FT gets state-of-the-art accuracies on an NLP continual learning benchmark.
A Light-weight Deep Learning Model for Remote Sensing Image Classification
Pham, Lam, Le, Cam, Ngo, Dat, Nguyen, Anh, Lampert, Jasmin, Schindler, Alexander, McLoughlin, Ian
In this paper, we present a high-performance and light-weight deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the aerial scene of a remote sensing image. To this end, we first evaluate various benchmark convolutional neural network (CNN) architectures: MobileNet V1/V2, ResNet 50/152V2, InceptionV3/InceptionResNetV2, EfficientNet B0/B7, DenseNet 121/201, ConvNeXt Tiny/Large. Then, the best-performing models are selected to train a compact model in a teacher-student arrangement. The knowledge distillation from the teacher aims to achieve high performance with significantly reduced complexity. In extensive experiments on the NWPU-RESISC45 benchmark, our proposed teacher-student model outperforms the state-of-the-art systems and has the potential to be applied on a wide range of edge devices.
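The teacher-student arrangement can be made concrete with a sketch of a distillation objective. The abstract does not state which loss is used, so the form below (temperature-softened KL divergence plus cross-entropy, as commonly used for knowledge distillation) and the `temperature` and `alpha` values are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term (assumed form)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft-target term: KL(teacher || student) on temperature-softened outputs.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard-label term: cross-entropy of the student at temperature 1.
    ce = -math.log(softmax(student_logits)[true_label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


student = [2.0, 0.0, 0.0]
agreeing = distillation_loss(student, [2.0, 0.0, 0.0], true_label=0)
disagreeing = distillation_loss(student, [0.0, 2.0, 0.0], true_label=0)
```

When the student already matches the teacher, only the cross-entropy term remains; a disagreeing teacher adds a positive penalty that pulls the compact model toward the teacher's output distribution.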
Deep Learning Based 3D Point Cloud Regression for Estimating Forest Biomass
Oehmcke, Stefan, Li, Lei, Trepekli, Katerina, Revenga, Jaime, Nord-Larsen, Thomas, Gieseke, Fabian, Igel, Christian
Robust quantification of forest carbon stocks and their dynamics is important for climate change mitigation and adaptation strategies [FAO and UNEP, 2020]. The Paris Agreement [United Nations / Framework Convention on Climate Change, 2015] and the IPCC [Shukla et al., 2019] acknowledge that climate change mitigation goals cannot be achieved without a substantial contribution from forests. Spatial details in the carbon budget of forests are necessary to encourage transformational actions towards a sustainable forest sector [Harris et al., 2021, 2012]. Currently, many countries do not have nationally specific forest carbon accumulation rates but rather rely on default rates from the IPCC 2018 [Masson-Delmotte et al., 2019, Requena Suarez et al., 2019], without accounting for finer-scale variations of carbon stocks [Cook-Patton et al., 2020]. Precise spatio-temporal monitoring of forest carbon dynamics at large scales has proven to be challenging [Erb et al., 2018, Griscom et al., 2017]. This is due to the complex structure of forests, topographic features, and land management practices [Tubiello et al., 2021, Lewis et al., 2019]. Technological developments in remote sensing and the concurrent increased availability of field-based measurements have led to an improvement in estimating carbon stocks using remote sensing observations of forest attributes that serve as a proxy for above-ground biomass (AGB) [Knapp et al., 2018, Bouvier et al., 2015, Pan et al., 2013]. Currently, three remote sensing techniques are applied to collect data for AGB estimates: i) passive optical imagery, ii) synthetic aperture radar (SAR), and iii) light detection and ranging (LiDAR).
Physically-Consistent Generative Adversarial Networks for Coastal Flood Visualization
Lütjens, Björn, Leshchinskiy, Brandon, Requena-Mesa, Christian, Chishtie, Farrukh, Díaz-Rodríguez, Natalia, Boulais, Océane, Sankaranarayanan, Aruna, Masson-Forsythe, Margaux, Piña, Aaron, Gal, Yarin, Raïssi, Chedy, Lavin, Alexander, Newman, Dava
As climate change increases the intensity of natural disasters, society needs better tools for adaptation. Floods, for example, are the most frequent natural disaster, and better tools for flood risk communication could increase the support for flood-resilient infrastructure development. Our work aims to enable more visual communication of large-scale climate impacts by visualizing the output of coastal flood models as satellite imagery. We propose the first deep learning pipeline to ensure physical consistency in synthetic visual satellite imagery. We extended a state-of-the-art GAN called pix2pixHD so that it produces imagery that is physically consistent with the output of an expert-validated storm surge model (NOAA SLOSH). By evaluating the imagery relative to physics-based flood maps, we find that our proposed framework outperforms baseline models in both physical consistency and photorealism. We envision our work to be the first step towards a global visualization of how the climate challenge will shape our landscape. Continuing on this path, we show that the proposed pipeline generalizes to visualize reforestation. We also publish a dataset of over 25k labelled image-triplets to study image-to-image translation in Earth observation.
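One way to score generated imagery against a physics-based flood map is an overlap measure between binary flood masks. The abstract does not name the metric, so the intersection-over-union below is an assumed choice, and it presumes that a flood mask has already been extracted from the generated image by some separate (hypothetical) segmentation step.

```python
def flood_iou(generated_mask, reference_mask):
    """Intersection over union of two binary flood masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for gen_row, ref_row in zip(generated_mask, reference_mask):
        for g, r in zip(gen_row, ref_row):
            intersection += 1 if (g and r) else 0
            union += 1 if (g or r) else 0
    # Two empty masks agree perfectly, so define their IoU as 1.0.
    return intersection / union if union else 1.0


generated = [[1, 1, 0],
             [0, 0, 0]]  # flood pixels segmented from a generated image
reference = [[1, 0, 0],
             [1, 0, 0]]  # flood extent from the physics-based model
score = flood_iou(generated, reference)  # 1 shared pixel out of 3 flooded overall
```

A score of 1.0 means the visualized flood extent matches the modeled extent exactly, while lower values indicate imagery that looks plausible but places water where the physical model does not.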