Bobra, Monica G.
A Machine-Learning-Ready Dataset Prepared from the Solar and Heliospheric Observatory Mission
Shneider, Carl, Hu, Andong, Tiwari, Ajay K., Bobra, Monica G., Battams, Karl, Teunissen, Jannis, Camporeale, Enrico
We present a Python tool to generate a standard dataset from solar images that allows for user-defined selection criteria and a range of pre-processing steps. Our Python tool works with all image products from both the Solar and Heliospheric Observatory (SoHO) and Solar Dynamics Observatory (SDO) missions. We discuss a dataset produced from the SoHO mission's multi-spectral images which is free of missing or corrupt data as well as planetary transits in coronagraph images, and is temporally synced making it ready for input to a machine learning system. Machine-learning-ready images are a valuable resource for the community because they can be used, for example, for forecasting space weather parameters. We illustrate the use of this data with a 3-5 day-ahead forecast of the north-south component of the interplanetary magnetic field (IMF) observed at Lagrange point one (L1). For this use case, we apply a deep convolutional neural network (CNN) to a subset of the full SoHO dataset and compare with baseline results from a Gaussian Naive Bayes classifier.
A Machine Learning Dataset Prepared From the NASA Solar Dynamics Observatory Mission
Galvez, Richard, Fouhey, David F., Jin, Meng, Szenicer, Alexandre, Muñoz-Jaramillo, Andrés, Cheung, Mark C. M., Wright, Paul J., Bobra, Monica G., Liu, Yang, Mason, James, Thomas, Rajat
In this paper we present a curated dataset from the NASA Solar Dynamics Observatory (SDO) mission in a format suitable for machine learning research. Beginning from level 1 scientific products we have processed various instrumental corrections, downsampled to manageable spatial and temporal resolutions, and synchronized observations spatially and temporally. We illustrate the use of this dataset with two example applications: forecasting future EVE irradiance from present EVE irradiance and translating HMI observations into AIA observations. For each application we provide metrics and baselines for future model comparison. We anticipate this curated dataset will facilitate machine learning research in heliophysics and the physical sciences generally, increasing the scientific return of the SDO mission. This work is a direct result of the 2018 NASA Frontier Development Laboratory Program. Please see the appendix for access to the dataset.