cloud mask
- North America > United States (0.14)
- Europe (0.04)
- Asia > China > Fujian Province > Fuzhou (0.04)
- Law (1.00)
- Government (1.00)
- Information Technology > Security & Privacy (0.46)
Temporal-Spatial Tubelet Embedding for Cloud-Robust MSI Reconstruction using MSI-SAR Fusion: A Multi-Head Self-Attention Video Vision Transformer Approach
Wang, Yiqun, Li, Lujun, Yue, Meiru, State, Radu
Cloud cover in multispectral imagery (MSI) significantly hinders early-season crop mapping by corrupting spectral information. Existing Vision Transformer (ViT)-based time-series reconstruction methods, such as SMTS-ViT, often employ coarse temporal embeddings that aggregate entire sequences, causing substantial information loss and reducing reconstruction accuracy. To address these limitations, this study proposes a Video Vision Transformer (ViViT)-based framework with temporal-spatial fusion embedding for MSI reconstruction in cloud-covered regions. Non-overlapping tubelets are extracted via 3D convolution with a constrained temporal span $(t=2)$, ensuring local temporal coherence while reducing cross-day information degradation. Both MSI-only and SAR-MSI fusion scenarios are evaluated. Comprehensive experiments on 2020 Traill County data demonstrate notable performance improvements: MTS-ViViT achieves a 2.23% reduction in MSE compared to the MTS-ViT baseline, while SMTS-ViViT achieves a 10.33% improvement with SAR integration over the SMTS-ViT baseline. The proposed framework effectively enhances spectral reconstruction quality for robust agricultural monitoring.
- North America > United States (0.14)
- Europe (0.04)
- Asia > China > Fujian Province > Fuzhou (0.04)
- Law (1.00)
- Government (1.00)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.73)
- Information Technology > Security & Privacy (0.46)
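The non-overlapping tubelet extraction described in the abstract above can be illustrated with a minimal NumPy sketch. It relies on the fact that a 3D convolution whose stride equals its kernel size is equivalent to cutting the input into blocks and applying one linear projection per block. Only the temporal span `t=2` comes from the abstract; the function name `tubelet_embed`, the patch size `p=4`, the embedding width 16, and the random input are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def tubelet_embed(x, weight, t=2, p=4):
    """Embed a sequence x of shape (T, H, W, C) into tubelet tokens.

    weight has shape (t * p * p * C, D). Returns an array of shape
    ((T // t) * (H // p) * (W // p), D). The patch size p and the
    width D are hypothetical; only t=2 is taken from the abstract.
    """
    T, H, W, C = x.shape
    if T % t or H % p or W % p:
        raise ValueError("T, H and W must be divisible by t, p and p")
    # Split every axis into (number of blocks, block size).
    blocks = x.reshape(T // t, t, H // p, p, W // p, p, C)
    # Move the block indices to the front: (T/t, H/p, W/p, t, p, p, C).
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5, 6)
    # Flatten each tubelet into one vector, then project it linearly.
    tokens = blocks.reshape(-1, t * p * p * C)
    return tokens @ weight


rng = np.random.default_rng(0)
series = rng.normal(size=(6, 8, 8, 3))       # 6 dates, 8x8 pixels, 3 bands
proj = rng.normal(size=(2 * 4 * 4 * 3, 16))  # hypothetical embedding width 16
embedded = tubelet_embed(series, proj)
print(embedded.shape)  # (12, 16)
```

With `t=2`, each token mixes information from only two consecutive acquisitions, which is the local temporal coherence the abstract refers to.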
Machine Learning for Cloud Detection in IASI Measurements: A Data-Driven SVM Approach with Physical Constraints
Zugarini, Chiara, Sgattoni, Cristina, Sgheri, Luca
Cloud detection is essential for atmospheric retrievals, climate studies, and weather forecasting. We analyze infrared radiances from the Infrared Atmospheric Sounding Interferometer (IASI) onboard the Meteorological Operational (MetOp) satellites to classify scenes as clear or cloudy. We apply the Support Vector Machine (SVM) approach, based on kernel methods for non-separable data. In this study, the method is implemented for Cloud Identification (CISVM) to classify the test set using radiances or brightness temperatures, with dimensionality reduction through Principal Component Analysis (PCA) and cloud-sensitive channel selection to focus on the most informative features. Our best configuration achieves 88.30 percent agreement with reference labels and shows strong consistency with cloud masks from the Moderate Resolution Imaging Spectroradiometer (MODIS), with the largest discrepancies in polar regions due to sensor differences. These results demonstrate that CISVM is a robust, flexible, and efficient method for automated cloud classification from infrared radiances, suitable for operational retrievals and future missions such as Far-infrared Outgoing Radiation Understanding and Monitoring (FORUM), the ninth European Space Agency Earth Explorer mission.
- Europe (1.00)
- North America > United States (0.68)
SpecTf: Transformers Enable Data-Driven Imaging Spectroscopy Cloud Detection
Lee, Jake H., Kiper, Michael, Thompson, David R., Brodrick, Philip G.
Current and upcoming generations of visible-shortwave infrared (VSWIR) imaging spectrometers promise unprecedented capacity to quantify Earth System processes across the globe. However, reliable cloud screening remains a fundamental challenge for these instruments, where traditional spatial and temporal approaches are limited by cloud variability and sparse temporal coverage. The Spectroscopic Transformer (SpecTf) addresses these challenges with a spectroscopy-specific deep learning architecture that performs cloud detection using only spectral information (no spatial or temporal data are required). By treating spectral measurements as sequences rather than image channels, SpecTf learns fundamental physical relationships without relying on spatial context. Our experiments demonstrate that SpecTf significantly outperforms the current baseline approach implemented for the EMIT instrument, and performs comparably with other machine learning methods with orders of magnitude fewer learned parameters. Critically, we demonstrate SpecTf's inherent interpretability through its attention mechanism, revealing physically meaningful spectral features the model has learned. Finally, we present SpecTf's potential for cross-instrument generalization by applying it to a different instrument on a different platform without modifications, opening the door to instrument-agnostic, data-driven algorithms for future imaging spectroscopy tasks.
- Government > Regional Government > North America Government > United States Government (0.69)
- Energy (0.48)
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery
Zhou, Hangyu, Kao, Chia-Hsiang, Phoo, Cheng Perng, Mall, Utkarsh, Hariharan, Bharath, Bala, Kavita
Clouds in satellite imagery pose a significant challenge for downstream applications. A major challenge in current cloud removal research is the absence of a comprehensive benchmark and a sufficiently large and diverse training dataset. To address this problem, we introduce AllClear, the largest public dataset for cloud removal, featuring 23,742 globally distributed regions of interest (ROIs) with diverse land-use patterns, comprising 4 million images in total. Each ROI includes complete temporal captures from the year 2022, with (1) multi-spectral optical imagery from Sentinel-2 and Landsat 8/9, (2) synthetic aperture radar (SAR) imagery from Sentinel-1, and (3) auxiliary remote sensing products such as cloud masks and land cover maps. We validate the effectiveness of our dataset by benchmarking performance, demonstrating a scaling law (the PSNR rises from $28.47$ to $33.87$ with $30\times$ more data), and conducting ablation studies on the temporal length and the importance of individual modalities. This dataset aims to provide comprehensive coverage of the Earth's surface and promote better cloud removal results.
- North America > United States (0.14)
- Europe (0.04)
- Asia > China > Fujian Province > Fuzhou (0.04)
- Law (1.00)
- Information Technology (1.00)
- Government (1.00)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.93)
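The PSNR figures quoted in the AllClear abstract above are easier to interpret with the standard definition in hand, $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The following is a minimal sketch of that textbook formula; the function name `psnr`, the peak value of 1.0, and the toy arrays are assumptions, and the benchmark's exact evaluation protocol is not given in the abstract.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally shaped images.

    max_value is the assumed peak of the data range (1.0 for reflectance
    scaled to [0, 1]); it is an illustrative default, not a benchmark setting.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


clean = np.zeros((4, 4))
noisy = clean + 0.1  # constant error of 0.1, so the MSE is 0.01
score = psnr(clean, noisy)
print(round(score, 6))  # 20.0
```

Under this definition, and for a single image pair with a fixed peak value, a gain of 5.40 dB (from 28.47 to 33.87) would correspond to the MSE shrinking by a factor of about 3.5.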
Explainable Earth Surface Forecasting under Extreme Events
Pellicer-Valero, Oscar J., Fernández-Torres, Miguel-Ángel, Ji, Chaonan, Mahecha, Miguel D., Camps-Valls, Gustau
With climate change-related extreme events on the rise, high-dimensional Earth observation data presents a unique opportunity for forecasting and understanding impacts on ecosystems. This is, however, impeded by the complexity of processing, visualizing, modeling, and explaining this data. To showcase how this challenge can be met, here we train a convolutional long short-term memory-based architecture on the novel DeepExtremeCubes dataset. DeepExtremeCubes includes around 40,000 long-term Sentinel-2 minicubes (January 2016-October 2022) worldwide, along with labeled extreme events, meteorological data, vegetation land cover, and topography maps, sampled from locations affected by extreme climate events and surrounding areas. When predicting future reflectances and vegetation impacts through the kernel normalized difference vegetation index, the model achieved an R$^2$ score of 0.9055 on the test set. Explainable artificial intelligence was used to analyze the model's predictions during the October 2020 Central South America compound heatwave and drought event. We chose the same area exactly one year before the event as a counterfactual, finding that the average temperature and surface pressure are generally the best predictors under normal conditions. In contrast, minimum anomalies of evaporation and surface latent heat flux take the lead during the event. A change of regime is also observed in the attributions before the event, which might help assess how long the event was brewing before happening. The code to replicate all experiments and figures in this paper is publicly available at https://github.com/DeepExtremes/txyXAI
- South America (0.49)
- Europe > Germany (0.15)
- Asia (0.14)
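The kernel normalized difference vegetation index named in the abstract above is commonly written as $\mathrm{kNDVI} = \tanh\big(((\mathrm{NIR} - \mathrm{red}) / (2\sigma))^2\big)$. With the frequently used choice $\sigma = 0.5\,(\mathrm{NIR} + \mathrm{red})$ it reduces to $\tanh(\mathrm{NDVI}^2)$. The sketch below shows that general formula; whether the paper uses this particular $\sigma$ is an assumption, and the function names and sample reflectances are illustrative only.

```python
import numpy as np


def ndvi(nir, red):
    """Classic normalized difference vegetation index."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    return (nir - red) / (nir + red)


def kndvi(nir, red, sigma=None):
    """Kernel NDVI with an RBF kernel of length scale sigma.

    If sigma is None, the per-pixel value 0.5 * (nir + red) is used
    (an assumed default), which makes the result equal tanh(NDVI ** 2).
    Pixels with nir + red == 0 are not handled in this sketch.
    """
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    if sigma is None:
        sigma = 0.5 * (nir + red)
    return np.tanh(((nir - red) / (2.0 * sigma)) ** 2)


nir_band = np.array([0.50, 0.30, 0.20])  # hypothetical reflectances
red_band = np.array([0.10, 0.10, 0.20])
index = kndvi(nir_band, red_band)
print(index)
```

The last pixel has equal NIR and red reflectance, so both NDVI and kNDVI are zero there.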
Improvements & Evaluations on the MLCommons CloudMask Benchmark
Chennamsetti, Varshitha, Mehnaz, Laiba, Zhao, Dan, Ghosh, Banani, Samsonau, Sergey V.
In this paper, we report the performance benchmarking results of deep learning models on MLCommons' Science cloud-masking benchmark using a high-performance computing cluster at New York University (NYU): NYU Greene. MLCommons is a consortium that develops and maintains several scientific benchmarks that can benefit from developments in AI. We provide a description of the cloud-masking benchmark task, updated code, and the best model for this benchmark when using our selected hyperparameter settings. Our benchmarking results include the highest accuracy achieved on the NYU system as well as the average time taken for both training and inference on the benchmark across several runs/seeds. Our code can be found on GitHub. The MLCommons team has been kept informed about our progress and may use the developed code for their future work.
- North America > United States > Virginia (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Western Europe (0.04)
MT-HCCAR: Multi-Task Deep Learning with Hierarchical Classification and Attention-based Regression for Cloud Property Retrieval
Li, Xingyan, Sayer, Andrew M., Carroll, Ian T., Huang, Xin, Wang, Jianwu
In the realm of Earth science, effective cloud property retrieval, encompassing cloud masking, cloud phase classification, and cloud optical thickness (COT) prediction, remains pivotal. Traditional methodologies necessitate distinct models for each sensor instrument due to their unique spectral characteristics. Recent strides in Earth science research have embraced machine learning and deep learning techniques to extract features from satellite datasets' spectral observations. However, prevailing approaches lack novel architectures accounting for hierarchical relationships among retrieval tasks. Moreover, considering the spectral diversity among existing sensors, the development of models with robust generalization capabilities over different sensor datasets is imperative. Surprisingly, there is a dearth of methodologies addressing the selection of an optimal model for diverse datasets. In response, this paper introduces MT-HCCAR, an end-to-end deep learning model employing multi-task learning to simultaneously tackle cloud masking, cloud phase retrieval (classification tasks), and COT prediction (a regression task). MT-HCCAR integrates a hierarchical classification network (HC) and a classification-assisted attention-based regression network (CAR), enhancing precision and robustness in cloud labeling and COT prediction. Additionally, a comprehensive model selection method rooted in K-fold cross-validation, the one-standard-error rule, and two introduced performance scores is proposed to select the optimal model over three simulated satellite datasets: OCI, VIIRS, and ABI. The experiments comparing MT-HCCAR with baseline methods, the ablation studies, and the model selection affirm the superiority and the generalization capabilities of MT-HCCAR.
- North America > United States > Maryland > Baltimore (0.14)
- North America > United States > Maryland > Prince George's County > Greenbelt (0.04)
- North America > United States > Maryland > Baltimore County > Towson (0.04)
- Energy (0.95)
- Government > Regional Government > North America Government > United States Government (0.46)
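The model selection procedure in the MT-HCCAR abstract above builds on K-fold cross-validation and the one-standard-error rule. A minimal sketch of the generic rule follows: among candidates ordered from simplest to most complex, pick the simplest one whose mean cross-validation error lies within one standard error of the best mean. The function name, the ordering convention, and the error values are hypothetical, and the paper's two additional performance scores are not described in the abstract, so they are not modeled here.

```python
import numpy as np


def one_standard_error_choice(fold_errors):
    """Return the index of the model picked by the one-standard-error rule.

    fold_errors has shape (n_models, k_folds); rows are assumed to be
    ordered from the simplest model to the most complex one.
    """
    fold_errors = np.asarray(fold_errors, dtype=np.float64)
    k = fold_errors.shape[1]
    means = fold_errors.mean(axis=1)
    # Standard error of each model's mean error across the k folds.
    std_errs = fold_errors.std(axis=1, ddof=1) / np.sqrt(k)
    best = int(np.argmin(means))
    threshold = means[best] + std_errs[best]
    # First (simplest) model whose mean error is within the threshold.
    return int(np.flatnonzero(means <= threshold)[0])


# Hypothetical 5-fold validation errors for three models, simplest first.
errors = [
    [0.30, 0.32, 0.31, 0.29, 0.33],
    [0.21, 0.24, 0.20, 0.23, 0.22],
    [0.20, 0.23, 0.19, 0.24, 0.21],
]
chosen = one_standard_error_choice(errors)
print(chosen)  # 1
```

Here the third model has the lowest mean error (0.214), but the second model's mean (0.22) falls within one standard error of it, so the simpler second model is selected.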
An Overview of MLCommons Cloud Mask Benchmark: Related Research and Data
von Laszewski, Gregor, Gu, Ruochen
Cloud masking is a crucial task that is well-motivated for meteorology and its applications in environmental and atmospheric sciences. Its goal is, given satellite images, to accurately generate cloud masks that classify each pixel in an image as either cloud or clear sky. In this paper, we summarize some of the ongoing research activities in cloud masking, with a focus on the research and benchmark work currently conducted in the MLCommons Science Working Group. This overview is produced with the hope that others will have an easier time getting started with, and collaborating on, the activities related to the MLCommons Cloud Mask Benchmark.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.15)
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
- North America > United States > New York (0.04)