Goto

Collaborating Authors

 cloud type


SGMAGNet: A Baseline Model for 3D Cloud Phase Structure Reconstruction on a New Passive Active Satellite Benchmark

arXiv.org Artificial Intelligence

Cloud phase profiles are critical for numerical weather prediction (NWP), as they directly affect radiative transfer and precipitation processes. In this study, we present a benchmark dataset and a baseline framework for transforming multimodal satellite observations into detailed 3D cloud phase structures, aiming toward operational cloud phase profile retrieval and future integration with NWP systems to improve cloud microphysics parameterization. The multimodal observations consist of (1) high--spatiotemporal--resolution, multi-band visible (VIS) and thermal infrared (TIR) imagery from geostationary satellites, and (2) accurate vertical cloud phase profiles from spaceborne lidar (CALIOP\slash CALIPSO) and radar (CPR\slash CloudSat). The dataset consists of synchronized image--profile pairs across diverse cloud regimes, defining a supervised learning task: given VIS/TIR patches, predict the corresponding 3D cloud phase structure. We adopt SGMAGNet as the main model and compare it with several baseline architectures, including UNet variants and SegNet, all designed to capture multi-scale spatial patterns. Model performance is evaluated using standard classification metrics, including Precision, Recall, F1-score, and IoU. The results demonstrate that SGMAGNet achieves superior performance in cloud phase reconstruction, particularly in complex multi-layer and boundary transition regions. Quantitatively, SGMAGNet attains a Precision of 0.922, Recall of 0.858, F1-score of 0.763, and an IoU of 0.617, significantly outperforming all baselines across these key metrics.


Learning-Based Planning for Improving Science Return of Earth Observation Satellites

arXiv.org Artificial Intelligence

Earth observing satellites are powerful tools for collecting scientific information about our planet, however they have limitations: they cannot easily deviate from their orbital trajectories, their sensors have a limited field of view, and pointing and operating these sensors can take a large amount of the spacecraft's resources. It is important for these satellites to optimize the data they collect and include only the most important or informative measurements. Dynamic targeting is an emerging concept in which satellite resources and data from a lookahead instrument are used to intelligently reconfigure and point a primary instrument. Simulation studies have shown that dynamic targeting increases the amount of scientific information gathered versus conventional sampling strategies. In this work, we present two different learning-based approaches to dynamic targeting, using reinforcement and imitation learning, respectively. These learning methods build on a dynamic programming solution to plan a sequence of sampling locations. We evaluate our approaches against existing heuristic methods for dynamic targeting, showing the benefits of using learning for this application. Imitation learning performs on average 10.0\% better than the best heuristic method, while reinforcement learning performs on average 13.7\% better. We also show that both learning methods can be trained effectively with relatively small amounts of data.


3D Cloud reconstruction through geospatially-aware Masked Autoencoders

arXiv.org Artificial Intelligence

Clouds play a key role in Earth's radiation balance with complex effects that introduce large uncertainties into climate models. Real-time 3D cloud data is essential for improving climate predictions. This study leverages geostationary imagery from MSG/SEVIRI and radar reflectivity measurements of cloud profiles from CloudSat/CPR to reconstruct 3D cloud structures. We first apply self-supervised learning (SSL) methods-Masked Autoencoders (MAE) and geospatially-aware SatMAE on unlabelled MSG images, and then fine-tune our models on matched image-profile pairs. Our approach outperforms state-of-the-art methods like U-Nets, and our geospatial encoding further improves prediction results, demonstrating the potential of SSL for cloud reconstruction.


CloudSense: A Model for Cloud Type Identification using Machine Learning from Radar data

arXiv.org Artificial Intelligence

The knowledge of type of precipitating cloud is crucial for radar based quantitative estimates of precipitation. We propose a novel model called CloudSense which uses machine learning to accurately identify the type of precipitating clouds over the complex terrain locations in the Western Ghats (WGs) of India. CloudSense uses vertical reflectivity profiles collected during July-August 2018 from an X-band radar to classify clouds into four categories namely stratiform,mixed stratiform-convective,convective and shallow clouds. The machine learning(ML) model used in CloudSense was trained using a dataset balanced by Synthetic Minority Oversampling Technique (SMOTE), with features selected based on physical characteristics relevant to different cloud types. Among various ML models evaluated Light Gradient Boosting Machine (LightGBM) demonstrate superior performance in classifying cloud types with a BAC of 0.8 and F1-Score of 0.82. CloudSense generated results are also compared against conventional radar algorithms and we find that CloudSense performs better than radar algorithms. For 200 samples tested, the radar algorithm achieved a BAC of 0.69 and F1-Score of 0.68, whereas CloudSense achieved a BAC and F1-Score of 0.77. Our results show that ML based approach can provide more accurate cloud detection and classification which would be useful to improve precipitation estimates over the complex terrain of the WG.


A Machine Learning Paradigm for Studying Pictorial Realism: Are Constable's Clouds More Real than His Contemporaries?

arXiv.org Artificial Intelligence

The British landscape painter John Constable is considered foundational for the Realist movement in 19th-century European painting. Constable's painted skies, in particular, were seen as remarkably accurate by his contemporaries, an impression shared by many viewers today. Yet, assessing the accuracy of realist paintings like Constable's is subjective or intuitive, even for professional art historians, making it difficult to say with certainty what set Constable's skies apart from those of his contemporaries. Our goal is to contribute to a more objective understanding of Constable's realism. We propose a new machine-learning-based paradigm for studying pictorial realism in an explainable way. Our framework assesses realism by measuring the similarity between clouds painted by artists noted for their skies, like Constable, and photographs of clouds. The experimental results of cloud classification show that Constable approximates more consistently than his contemporaries the formal features of actual clouds in his paintings. The study, as a novel interdisciplinary approach that combines computer vision and machine learning, meteorology, and art history, is a springboard for broader and deeper analyses of pictorial realism.


A labeled dataset of cloud types using data from GOES-16 and CloudSat

arXiv.org Artificial Intelligence

In this paper we present the development of a dataset consisting of 91 Multi-band Cloud and Moisture Product Full-Disk (MCMIPF) from the Advanced Baseline Imager (ABI) on board GOES-16 geostationary satellite with 91 temporally and spatially corresponding CLDCLASS products from the CloudSat polar satellite. The products are diurnal, corresponding to the months of January and February 2019 and were chosen such that the products from both satellites can be co-located over South America. The CLDCLASS product provides the cloud type observed for each of the orbit's steps and the GOES-16 multiband images contain pixels that can be co-located with these data. We develop an algorithm that returns a product in the form of a table that provides pixels from multiband images labelled with the type of cloud observed in them. These labelled data conformed in this particular structure are very useful to perform supervised learning. This was corroborated by training a simple linear artificial neural network based on the work of Gorooh et al. (2020), which gave good results, especially for the classification of deep convective clouds.


Cumulo: A Dataset for Learning Cloud Classes

arXiv.org Machine Learning

One of the greatest sources of uncertainty in future climate projections comes from limitations in modelling clouds and in understanding how different cloud types interact with the climate system. A key first step in reducing this uncertainty is to accurately classify cloud types at high spatial and temporal resolution. In this paper, we introduce Cumulo, a benchmark dataset for training and evaluating global cloud classification models. It consists of one year of 1km resolution MODIS hyperspectral imagery merged with pixel-width 'tracks' of CloudSat cloud labels. Bringing these complementary datasets together is a crucial first step, enabling the Machine-Learning community to develop innovative new techniques which could greatly benefit the Climate community. To showcase Cumulo, we provide baseline performance analysis using an invertible flow generative model (IResNet), which further allows us to discover new sub-classes for a given cloud class by exploring the latent space. To compare methods, we introduce a set of evaluation criteria, to identify models that are not only accurate, but also physically-realistic.