Geophysical Analysis & Survey
Deep Learning Enhanced Road Traffic Analysis: Scalable Vehicle Detection and Velocity Estimation Using PlanetScope Imagery
Adamiak, Maciej, Grinblat, Yulia, Psotta, Julian, Fulman, Nir, Mazumdar, Himshikhar, Tang, Shiyu, Zipf, Alexander
This paper presents a method for detecting and estimating vehicle speeds using PlanetScope SuperDove satellite imagery, offering a scalable solution for global vehicle traffic monitoring. Conventional methods such as stationary sensors and mobile systems like UAVs are limited in coverage and constrained by high costs and legal restrictions. Satellite-based approaches provide broad spatial coverage but face challenges, including high costs, low frame rates, and difficulty detecting small vehicles in high-resolution imagery. We propose a Keypoint R-CNN model to track vehicle trajectories across RGB bands, leveraging band timing differences to estimate speed. Validation is performed using drone footage and GPS data covering highways in Germany and Poland. Our model achieved a Mean Average Precision of 0.53 and velocity estimation errors of approximately 3.4 m/s compared to GPS data. Results from drone comparison reveal underestimations, with average speeds of 112.85 km/h for satellite data versus 131.83 km/h from drone footage. While challenges remain with high-speed accuracy, this approach demonstrates the potential for scalable, daily traffic monitoring across vast areas, providing valuable insights into global traffic dynamics.
Lightning UQ Box: A Comprehensive Framework for Uncertainty Quantification in Deep Learning
Lehmann, Nils, Gawlikowski, Jakob, Stewart, Adam J., Jancauskas, Vytautas, Depeweg, Stefan, Nalisnick, Eric, Gottschling, Nina Maria
Uncertainty quantification (UQ) is an essential tool for applying deep neural networks (DNNs) to real world tasks, as it attaches a degree of confidence to DNN outputs. However, despite its benefits, UQ is often left out of the standard DNN workflow due to the additional technical knowledge required to apply and evaluate existing UQ procedures. Hence there is a need for a comprehensive toolbox that allows the user to integrate UQ into their modelling workflow, without significant overhead. We introduce \texttt{Lightning UQ Box}: a unified interface for applying and evaluating various approaches to UQ. In this paper, we provide a theoretical and quantitative comparison of the wide range of state-of-the-art UQ methods implemented in our toolbox. We focus on two challenging vision tasks: (i) estimating tropical cyclone wind speeds from infrared satellite imagery and (ii) estimating the power output of solar panels from RGB images of the sky. By highlighting the differences between methods our results demonstrate the need for a broad and approachable experimental framework for UQ, that can be used for benchmarking UQ methods. The toolbox, example implementations, and further information are available at: https://github.com/lightning-uq-box/lightning-uq-box
A Robust Multisource Remote Sensing Image Matching Method Utilizing Attention and Feature Enhancement Against Noise Interference
Li, Yuan, Wu, Dapeng, Cui, Yaping, He, Peng, Zhang, Yuan, Wang, Ruyan
Image matching is a fundamental and critical task of multisource remote sensing image applications. However, remote sensing images are susceptible to various noises. Accordingly, how to effectively achieve accurate matching in noise images is a challenging problem. To solve this issue, we propose a robust multisource remote sensing image matching method utilizing attention and feature enhancement against noise interference. In the first stage, we combine deep convolution with the attention mechanism of transformer to perform dense feature extraction, constructing feature descriptors with higher discriminability and robustness. Subsequently, we employ a coarse-to-fine matching strategy to achieve dense matches. In the second stage, we introduce an outlier removal network based on a binary classification mechanism, which can establish effective and geometrically consistent correspondences between images; through weighting for each correspondence, inliers vs. outliers classification are performed, as well as removing outliers from dense matches. Ultimately, we can accomplish more efficient and accurate matches. To validate the performance of the proposed method, we conduct experiments using multisource remote sensing image datasets for comparison with other state-of-the-art methods under different scenarios, including noise-free, additive random noise, and periodic stripe noise. Comparative results indicate that the proposed method has a more well-balanced performance and robustness. The proposed method contributes a valuable reference for solving the difficult problem of noise image matching.
Center-fixing of tropical cyclones using uncertainty-aware deep learning applied to high-temporal-resolution geostationary satellite imagery
Lagerquist, Ryan, Chirokova, Galina, DeMaria, Robert, DeMaria, Mark, Ebert-Uphoff, Imme
Determining the location of a tropical cyclone's (TC) surface circulation center -- "center-fixing" -- is a critical first step in the TC-forecasting process, affecting current and future estimates of track, intensity, and structure. Despite a recent increase in the number of automated center-fixing methods, only one such method (ARCHER-2) is operational, and its best performance is achieved when using microwave or scatterometer data, which are not available at every forecast cycle. We develop a deep-learning algorithm called GeoCenter; it relies only on geostationary IR satellite imagery, which is available for all TC basins at high frequency (10-15 min) and low latency (< 10 min) during both day and night. GeoCenter ingests an animation (time series) of IR images, including 10 channels at lag times up to 3 hours. The animation is centered at a "first guess" location, offset from the true TC-center location by 48 km on average and sometimes > 100 km; GeoCenter is tasked with correcting this offset. On an independent testing dataset, GeoCenter achieves a mean/median/RMS (root mean square) error of 26.9/23.3/32.0 km for all systems, 25.7/22.3/30.5 km for tropical systems, and 15.7/13.6/18.6 km for category-2--5 hurricanes. These values are similar to ARCHER-2 errors when microwave or scatterometer data are available, and better than ARCHER-2 errors when only IR data are available. GeoCenter also performs skillful uncertainty quantification (UQ), producing a well calibrated ensemble of 200 TC-center locations. Furthermore, all predictors used by GeoCenter are available in real time, which would make GeoCenter easy to implement operationally every 10-15 min.
Development and Application of a Sentinel-2 Satellite Imagery Dataset for Deep-Learning Driven Forest Wildfire Detection
Martin, Valeria, Venable, K. Brent, Morgan, Derek
Forest loss due to natural events, such as wildfires, represents an increasing global challenge that demands advanced analytical methods for effective detection and mitigation. To this end, the integration of satellite imagery with deep learning (DL) methods has become essential. Nevertheless, this approach requires substantial amounts of labeled data to produce accurate results. In this study, we use bi-temporal Sentinel-2 satellite imagery sourced from Google Earth Engine (GEE) to build the California Wildfire GeoImaging Dataset (CWGID), a high-resolution labeled satellite imagery dataset with over 100,000 labeled before and after forest wildfire image pairs for wildfire detection through DL. Our methods include data acquisition from authoritative sources, data processing, and an initial dataset analysis using three pre-trained Convolutional Neural Network (CNN) architectures. Our results show that the EF EfficientNet-B0 model achieves the highest accuracy of over 92% in detecting forest wildfires. The CWGID and the methodology used to build it, prove to be a valuable resource for training and testing DL architectures for forest wildfire detection.
RingMo-Aerial: An Aerial Remote Sensing Foundation Model With A Affine Transformation Contrastive Learning
Diao, Wenhui, Yu, Haichen, Kang, Kaiyue, Ling, Tong, Liu, Di, Feng, Yingchao, Bi, Hanbo, Ren, Libo, Li, Xuexue, Mao, Yongqiang, Sun, Xian
Aerial Remote Sensing (ARS) vision tasks pose significant challenges due to the unique characteristics of their viewing angles. Existing research has primarily focused on algorithms for specific tasks, which have limited applicability in a broad range of ARS vision applications. This paper proposes the RingMo-Aerial model, aiming to fill the gap in foundation model research in the field of ARS vision. By introducing the Frequency-Enhanced Multi-Head Self-Attention (FE-MSA) mechanism and an affine transformation-based contrastive learning pre-training method, the model's detection capability for small targets is enhanced and optimized for the tilted viewing angles characteristic of ARS. Furthermore, the ARS-Adapter, an efficient parameter fine-tuning method, is proposed to improve the model's adaptability and effectiveness in various ARS vision tasks. Experimental results demonstrate that RingMo-Aerial achieves SOTA performance on multiple downstream tasks. This indicates the practicality and effectiveness of RingMo-Aerial in enhancing the performance of ARS vision tasks.
A Novel Adaptive Fine-Tuning Algorithm for Multimodal Models: Self-Optimizing Classification and Selection of High-Quality Datasets in Remote Sensing
Ren, Yi, Zhang, Tianyi, Han, Zhixiong, Li, Weibin, Wang, Zhiyang, Ji, Wenbo, Qin, Chenhao, Liang, Chenbin, Jiao, Licheng
We propose an adaptive fine-tuning algorithm for multimodal large models. The core steps of this algorithm involve two stages of truncation. First, the vast amount of data is projected into a semantic vector space, and the MiniBatchKMeans algorithm is used for automated clustering. This classification ensures that the data within each cluster exhibit high semantic similarity. Next, we process the data in each cluster, calculating the translational difference between the original and perturbed data in the multimodal large model's vector space. This difference serves as a generalization metric for the data. Based on this metric, we select the data with high generalization potential for training. We applied this algorithm to train the InternLM-XComposer2-VL-7B model on two 3090 GPUs using one-third of the GeoChat multimodal remote sensing dataset. The results demonstrate that our algorithm outperforms the state-of-the-art baselines. various baselines. The model trained on our optimally chosen one-third dataset, based on experimental validation, exhibited only 1% reduction in performance across various remote sensing metrics compared to the model trained on the full dataset. This approach significantly preserved general-purpose capabilities while reducing training time by 68.2%. Furthermore, the model achieved scores of 89.86 and 77.19 on the UCMerced and AID evaluation datasets, respectively, surpassing the GeoChat dataset by 5.43 and 5.16 points. It only showed a 0.91-point average decrease on the LRBEN evaluation dataset.
Bushfire Severity Modelling and Future Trend Prediction Across Australia: Integrating Remote Sensing and Machine Learning
Partheepan, Shouthiri, Sanati, Farzad, Hassan, Jahan
Bushfire is one of the major natural disasters that cause huge losses to livelihoods and the environment. Understanding and analyzing the severity of bushfires is crucial for effective management and mitigation strategies, helping to prevent the extensive damage and loss caused by these natural disasters. This study presents an in-depth analysis of bushfire severity in Australia over the last twelve years, combining remote sensing data and machine learning techniques to predict future fire trends. By utilizing Landsat imagery and integrating spectral indices like NDVI, NBR, and Burn Index, along with topographical and climatic factors, we developed a robust predictive model using XGBoost. The model achieved high accuracy, 86.13%, demonstrating its effectiveness in predicting fire severity across diverse Australian ecosystems. By analyzing historical trends and integrating factors such as population density and vegetation cover, we identify areas at high risk of future severe bushfires. Additionally, this research identifies key regions at risk, providing data-driven recommendations for targeted firefighting efforts. The findings contribute valuable insights into fire management strategies, enhancing resilience to future fire events in Australia. Also, we propose future work on developing a UAV-based swarm coordination model to enhance fire prediction in real-time and firefighting capabilities in the most vulnerable regions.
BAFNet: Bilateral Attention Fusion Network for Lightweight Semantic Segmentation of Urban Remote Sensing Images
Large-scale semantic segmentation networks often achieve high performance, while their application can be challenging when faced with limited sample sizes and computational resources. In scenarios with restricted network size and computational complexity, models encounter significant challenges in capturing long-range dependencies and recovering detailed information in images. We propose a lightweight bilateral semantic segmentation network called bilateral attention fusion network (BAFNet) to efficiently segment high-resolution urban remote sensing images. The model consists of two paths, namely dependency path and remote-local path. The dependency path utilizes large kernel attention to acquire long-range dependencies in the image. Besides, multi-scale local attention and efficient remote attention are designed to construct remote-local path. Finally, a feature aggregation module is designed to effectively utilize the different features of the two paths. Our proposed method was tested on public high-resolution urban remote sensing datasets Vaihingen and Potsdam, with mIoU reaching 83.20% and 86.53%, respectively. As a lightweight semantic segmentation model, BAFNet not only outperforms advanced lightweight models in accuracy but also demonstrates comparable performance to non-lightweight state-of-the-art methods on two datasets, despite a tenfold variance in floating-point operations and a fifteenfold difference in network parameters.
Deep Learning tools to support deforestation monitoring in the Ivory Coast using SAR and Optical satellite imagery
Sartor, Gabriele, Salis, Matteo, Pinardi, Stefano, Saracik, Ozgur, Meo, Rosa
Deforestation is gaining an increasingly importance due to its strong influence on the sorrounding environment, especially in developing countries where population has a disadvantaged economic condition and agriculture is the main source of income. In Ivory Coast, for instance, where the cocoa production is the most remunerative activity, it is not rare to assist to the replacement of portion of ancient forests with new cocoa plantations. In order to monitor this type of deleterious activities, satellites can be employed to recognize the disappearance of the forest to prevent it from expand its area of interest. In this study, Forest-Non-Forest map (FNF) has been used as ground truth for models based on Sentinel images input. State-of-the-art models U-Net, Attention U-Net, Segnet and FCN32 are compared over different years combining Sentinel-1, Sentinel-2 and cloud probability to create forest/non-forest segmentation. Although Ivory Coast lacks of forest coverage datasets and is partially covered by Sentinel images, it is demonstrated the feasibility to create models classifying forest and non-forests pixels over the area using open datasets to predict where deforestation could have occurred. Although a significant portion of the deforestation research is carried out on visible bands, SAR acquisitions are employed to overcome the limits of RGB images over areas often covered by clouds. Finally, the most promising model is employed to estimate the hectares of forest has been cut between 2019 and 2020.