 Geophysical Analysis & Survey


MultiScale Probability Map guided Index Pooling with Attention-based learning for Road and Building Segmentation

arXiv.org Artificial Intelligence

Efficient extraction of road and building footprints from satellite images is important in many remote sensing applications. However, precise segmentation map extraction is quite challenging due to the diverse building structures camouflaged by trees, similar spectral responses between the roads and buildings, and occlusions by heterogeneous traffic over the roads. Existing convolutional neural network (CNN)-based methods focus on either enriched spatial semantics learning for the building extraction or the fine-grained road topology extraction. The profound semantic information loss due to the traditional pooling mechanisms in CNNs generates fragmented and disconnected road maps and poorly segmented boundaries for the densely spaced small buildings in complex surroundings. In this paper, we propose a novel attention-aware segmentation framework, Multi-Scale Supervised Dilated Multiple-Path Attention Network (MSSDMPA-Net), equipped with two new modules, Dynamic Attention Map Guided Index Pooling (DAMIP) and Dynamic Attention Map Guided Spatial and Channel Attention (DAMSCA), to precisely extract the building footprints and road maps from remotely sensed images. DAMIP mines the salient features by employing a novel index pooling mechanism to retain important geometric information. DAMSCA, in turn, simultaneously extracts the multi-scale spatial and spectral features. In addition, using dilated convolution and multi-scale deep supervision in optimizing MSSDMPA-Net helps achieve strong performance. Experimental results over multiple benchmark building and road extraction datasets establish MSSDMPA-Net as the state-of-the-art (SOTA) method for building and road extraction.


Estimating Causal Effects Under Image Confounding Bias with an Application to Poverty in Africa

arXiv.org Artificial Intelligence

Observational studies of causal effects require adjustment for confounding factors. In the tabular setting, where these factors are well-defined, separate random variables, the effect of confounding is well understood. However, in public policy, ecology, and medicine, decisions are often made in non-tabular settings, informed by patterns or objects detected in images (e.g., maps, satellite or tomography imagery). Using such imagery for causal inference presents an opportunity because objects in the image may be related to the treatment and outcome of interest. In these cases, we rely on the images to adjust for confounding, but the observed data do not directly label the existence of the important objects. Motivated by real-world applications, we formalize this challenge, how it can be handled, and what conditions are sufficient to identify and estimate causal effects. We analyze finite-sample performance using simulation experiments, estimating effects using a propensity adjustment algorithm that employs a machine learning model to estimate the image confounding. Our experiments also examine sensitivity to misspecification of the image pattern mechanism. Finally, we use our methodology to estimate the effects of policy interventions on poverty in African communities from satellite imagery.
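
As a rough illustration of what a propensity adjustment does, the sketch below computes a normalized inverse-probability-weighted effect estimate. It is a generic textbook estimator, not the algorithm from the paper: the abstract does not give the estimator's form, and the function name `ipw_effect` and its inputs are hypothetical. It assumes the propensity scores (which the paper would obtain from a model applied to the images) are already available and lie strictly between 0 and 1.

```python
def ipw_effect(treated, outcome, propensity):
    """Normalized inverse-probability-weighted difference in mean outcomes.

    treated:    list of 0/1 treatment indicators
    outcome:    list of observed outcomes
    propensity: list of estimated P(treated = 1 | confounders), each in (0, 1)
    All names are illustrative assumptions, not taken from the paper.
    """
    # Weighted mean outcome among treated units (weights 1 / e).
    num_t = sum(t * y / e for t, y, e in zip(treated, outcome, propensity))
    den_t = sum(t / e for t, e in zip(treated, propensity))
    # Weighted mean outcome among control units (weights 1 / (1 - e)).
    num_c = sum((1 - t) * y / (1 - e)
                for t, y, e in zip(treated, outcome, propensity))
    den_c = sum((1 - t) / (1 - e) for t, e in zip(treated, propensity))
    return num_t / den_t - num_c / den_c
```

With constant propensity scores the estimate reduces to the plain difference in group means, which is a convenient sanity check.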


Tracking the industrial growth of modern China with high-resolution panchromatic imagery: A sequential convolutional approach

arXiv.org Artificial Intelligence

Satellite imagery analysis using deep learning methods, specifically convolutional neural networks (CNNs), has grown in popularity since 2012, with uses extending into the estimation of population [1], wealth [2], poverty [3], conflict [4], migration [5], education [6], and infrastructure [7], among other applications [8, 9, 10, 11]. These techniques have broadly illustrated that harnessing satellites to remotely track development over time in otherwise data-sparse regions is a potentially effective strategy [12]. One currently untested application of deep learning with satellite imagery is the identification and monitoring of industrial sites (e.g., factories, power plants, ports). The development of industrial sites is of broad interest, as it can serve as a proxy for everything from economic development [13] to the projection of soft power [14]. Because of its interrelationship with national security or proprietary corporate interests, information on such large-scale development is often undocumented or difficult for interested parties to obtain openly. This article focuses on testing our capability to automatically detect and monitor industrial sites within China using high-resolution panchromatic satellite imagery. Largely unrecorded in structured open source text information, the size and extent of industrial sites in China can be observed through routine or targeted satellite collection. From select sources, many locations appear, on average, at least yearly in cloud-free high-resolution imagery from satellite-based sensors over the past 15 years; some locations of interest have temporal granularity as fine as one day. To date, no work has explored the use of machine learning methods trained on satellite imagery to estimate, and monitor over time, the development of particular economic industries at the scale of individual sites.


Threatening Patch Attacks on Object Detection in Optical Remote Sensing Images

arXiv.org Artificial Intelligence

Advanced Patch Attacks (PAs) on object detection in natural images have pointed out the great safety vulnerability in methods based on deep neural networks. However, little attention has been paid to this topic in Optical Remote Sensing Images (O-RSIs). To this end, we study PAs on object detection in O-RSIs and propose a more Threatening PA, dubbed TPA, that does not sacrifice visual quality. Specifically, to address the problem of inconsistency between local and global landscapes in existing patch selection schemes, we propose leveraging the First-Order Difference (FOD) of the objective function before and after masking to select the sub-patches to be attacked. Further, considering the problem of gradient inundation when applying existing coordinate-based loss to PAs directly, we design an IoU-based objective function specific to PAs, dubbed Bounding box Drifting Loss (BDL), which pushes the detected bounding boxes far from the initial ones until there are no intersections between them. Finally, on two widely used benchmarks, i.e., DIOR and DOTA, comprehensive evaluations of our TPA with four typical detectors (Faster R-CNN, FCOS, RetinaNet, and YOLO-v4) demonstrate its remarkable effectiveness. To the best of our knowledge, this is the first attempt to study PAs on object detection in O-RSIs, and we hope this work will interest readers in studying this topic.
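
The quantity underlying an IoU-based objective can be sketched briefly. The abstract does not give the BDL formula, so the code below only computes the standard intersection-over-union of two axis-aligned boxes; the `(x1, y1, x2, y2)` corner format and the function name `box_iou` are assumptions made for illustration. An IoU of zero corresponds to the stopping condition described above, where the drifted box no longer intersects the initial one.

```python
def box_iou(a, b):
    """Standard IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    The corner format is an assumption; this is not the paper's BDL itself.
    """
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Guard against degenerate zero-area boxes.
    return inter / union if union > 0 else 0.0
```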


Unsupervised Seismic Footprint Removal With Physical Prior Augmented Deep Autoencoder

arXiv.org Artificial Intelligence

Seismic acquisition footprints appear as persistently faint, dim structures that are fully spatially coherent, so suppressing them inevitably damages useful signals. Various footprint removal methods, including filtering and sparse representation (SR), have been reported to attain promising results for surmounting this challenge. However, these methods, e.g., SR, rely solely on the handcrafted image priors of useful signals, which is sometimes an unreasonable demand if complex geological structures are contained in the given seismic data. As an alternative, this article proposes a footprint removal network (dubbed FR-Net) for the unsupervised suppression of acquisition footprints without any assumptions regarding valuable signals. The key to the FR-Net is to design a unidirectional total variation (UTV) model for the acquisition footprint according to the intrinsically directional property of the noise. By strongly regularizing a deep convolutional autoencoder (DCAE) using the UTV model, our FR-Net transforms the DCAE from an entirely data-driven model to a prior-augmented approach, inheriting the superiority of the DCAE and our footprint model. Subsequently, the complete separation of the footprint noise and useful signals is projected in an unsupervised manner, specifically by optimizing the FR-Net via the backpropagation (BP) algorithm. We provide qualitative and quantitative evaluations conducted on three synthetic and field datasets, demonstrating that our FR-Net surpasses the previous state-of-the-art (SOTA) methods.
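
To make the idea of a directional prior concrete, the sketch below computes total variation along a single axis of a small 2-D array. This is one common reading of "unidirectional total variation" and is offered only as an assumption: the abstract does not state the paper's exact UTV model, and the function name `directional_tv` is hypothetical. The point it illustrates is that stripe-like noise has near-zero variation along the stripe direction and large variation across it.

```python
def directional_tv(image, axis):
    """Sum of absolute first differences along one axis of a 2-D list.

    axis=0 differences vertically adjacent samples (down the rows);
    axis=1 differences horizontally adjacent samples (across the columns).
    Illustrative only; not the paper's exact UTV formulation.
    """
    rows, cols = len(image), len(image[0])
    if axis == 0:
        return sum(abs(image[r + 1][c] - image[r][c])
                   for r in range(rows - 1) for c in range(cols))
    return sum(abs(image[r][c + 1] - image[r][c])
               for r in range(rows) for c in range(cols - 1))


# Vertical stripes: every row is identical, so variation down the rows is zero.
stripes = [[0, 1, 0, 1] for _ in range(3)]
along_stripes = directional_tv(stripes, axis=0)
across_stripes = directional_tv(stripes, axis=1)
```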


Mask Conditional Synthetic Satellite Imagery

arXiv.org Artificial Intelligence

In this paper we propose a mask-conditional synthetic image generation model for creating synthetic satellite imagery datasets. Given a dataset of real high-resolution images and accompanying land cover masks, we show that it is possible to train an upstream conditional synthetic imagery generator, use that generator to create synthetic imagery with the land cover masks, then train a downstream model on the synthetic imagery and land cover masks that achieves similar test performance to a model that was trained with the real imagery. Further, we find that incorporating a mixture of real and synthetic imagery acts as a data augmentation method, producing better models than using only real imagery (0.5834 vs. 0.5235 mIoU). Finally, we find that encouraging diversity of outputs in the upstream model is a necessary component for improved downstream task performance. We have released code for reproducing our work on GitHub.
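
For readers unfamiliar with the reported metric, the following is a minimal sketch of mean intersection-over-union (mIoU) on flattened label maps. The averaging convention here (skipping classes that appear in neither prediction nor ground truth) is an assumption; the abstract does not say which convention the authors used, and the function name `mean_iou` is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean per-class IoU over two equal-length lists of integer labels.

    Classes absent from both pred and target are skipped (an assumed
    convention; evaluation code in practice may differ).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```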


An End-to-End Two-Phase Deep Learning-Based workflow to Segment Man-made Objects Around Reservoirs

arXiv.org Artificial Intelligence

Reservoirs are fundamental infrastructures for the management of water resources. Constructions around them can negatively impact their quality. Such unauthorized constructions can be monitored by land cover mapping (LCM) of remote sensing (RS) images. In this paper, we develop a new approach based on deep learning (DL) and image processing techniques for man-made object segmentation around the reservoirs. In order to segment man-made objects around the reservoirs in an end-to-end procedure, segmenting reservoirs and identifying the region of interest (RoI) around them are essential. In the proposed two-phase workflow, the reservoir is initially segmented using a DL model. A post-processing stage is proposed to remove errors such as floating vegetation. Next, the RoI around the reservoir (RoIaR) is identified using the proposed image processing techniques. Finally, the man-made objects in the RoIaR are segmented using a DL architecture. We trained the proposed workflow using collected Google Earth (GE) images of eight reservoirs in Brazil over two different years. The U-Net-based and SegNet-based architectures are trained to segment the reservoirs. To segment man-made objects in the RoIaR, we trained and evaluated four possible architectures: U-Net, FPN, LinkNet, and PSPNet. Although the collected data are highly diverse (for example, they cover different states, seasons, and resolutions), we achieved good performance in both phases. Furthermore, applying the proposed post-processing to the output of reservoir segmentation improves the precision in all studied reservoirs except two cases. We validated the prepared workflow with a reservoir dataset outside the training reservoirs. The results show that the workflow generalizes well.


Novel Building Detection and Location Intelligence Collection in Aerial Satellite Imagery

arXiv.org Artificial Intelligence

Detecting building structures in aerial images, and retrieving information about those buildings, is important for city planning and management and for land use analysis. It can be the centerpiece of tasks such as planning evacuation routes in case of an earthquake, flood management, etc. These applications rely on being able to accurately retrieve up-to-date information. Being able to accurately detect buildings in a bounding box centered on a specific latitude-longitude value can help greatly. The key challenge is to detect buildings of very different kinds, which can be commercial, industrial, hut settlements, or skyscrapers. Once we are able to detect such buildings, our goal will be to cluster and categorize similar types of buildings together.


Example-Based Explainable AI and its Application for Remote Sensing Image Classification

arXiv.org Artificial Intelligence

We present a method of explainable artificial intelligence (XAI), "What I Know (WIK)", that provides additional information for verifying the reliability of a deep learning model by showing an instance from the training dataset that is similar to the input data to be inferred. We demonstrate it in a remote sensing image classification task. One of the expected roles of XAI methods is verifying whether inferences of a trained machine learning model are valid for an application, and the datasets used for training the model are as important a factor as the model architecture. Our data-centric approach can help determine whether the training dataset is sufficient for each inference by checking the selected example data. If the selected example looks similar to the input data, we can confirm that the model was not trained on a dataset with a feature distribution far from the feature of the input data. With this method, the criteria for selecting an example are not merely data similarity with the input data but also data similarity in the context of the model task. Using a remote sensing image dataset from the Sentinel-2 satellite, the concept was successfully demonstrated with reasonably selected examples. This method can be applied to various machine-learning tasks, including classification and regression.


Remote Sensing

#artificialintelligence

Due to its relation to the Earth’s climate and weather and phenomena like drought, flooding, or landslides, knowledge of the soil moisture content is valuable to many scientific and professional users. Remote sensing offers the unique possibility for continuous measurements of this variable. Especially for agriculture, there is a strong demand for high spatial resolution mapping. However, operationally available soil moisture products exist with medium to coarse spatial resolution only (≥1 km). This study introduces a machine learning (ML)-based approach for the high spatial resolution (50 m) mapping of soil moisture based on the integration of Landsat-8 optical and thermal images, Copernicus Sentinel-1 C-Band SAR images, and modelled data, executable in the Google Earth Engine. The novelty of this approach lies in applying an entirely data-driven ML concept for global estimation of the surface soil moisture content. Globally distributed in situ data from the International Soil Moisture Network acted as an input for model training. Based on the independent validation dataset, the resulting overall estimation accuracy, in terms of Root-Mean-Squared-Error and R², was 0.04 m³·m⁻³ and 0.81, respectively. Beyond the retrieval model itself, this article introduces a framework for collecting training data and a stand-alone Python package for soil moisture mapping. The Google Earth Engine Python API facilitates the execution of data collection and retrieval, which is entirely cloud-based. For soil moisture retrieval, it eliminates the requirement to download or preprocess any input datasets.
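
The two accuracy measures quoted above can be sketched in a few lines of plain Python. This is only an illustration of the standard definitions: the source does not say whether its R² is the coefficient of determination (computed here) or a squared correlation coefficient, and the function names `rmse` and `r_squared` are hypothetical. It assumes the observations are not all identical, since R² is undefined in that case.

```python
import math


def rmse(observed, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the observed values are not all equal (SS_tot > 0).
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

RMSE carries the units of the variable (here volumetric soil moisture, m³·m⁻³), while R² is dimensionless.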