Self-Supervised Representation Learning for Astronomical Images
Md Abul Hayat, George Stein, Peter Harrington, Zarija Lukić, Mustafa Mustafa
arXiv.org Artificial Intelligence
Submitted to The Astrophysical Journal Letters

ABSTRACT

Sky surveys are the largest data generators in astronomy, making automated tools for extracting meaningful scientific information an absolute necessity. We show that, without the need for labels, self-supervised learning recovers representations of sky survey images that are semantically useful for a variety of scientific tasks. These representations can be directly used as features, or fine-tuned, to outperform supervised methods trained only on labeled data. We apply a contrastive learning framework on multi-band galaxy photometry from the Sloan Digital Sky Survey (SDSS) to learn image representations. We then use them for galaxy morphology classification, and fine-tune them for photometric redshift estimation, using labels from the Galaxy Zoo 2 dataset and SDSS spectroscopy. In both downstream tasks, using the same learned representations, we outperform the supervised state-of-the-art results, and we show that our approach can achieve the accuracy of supervised models while using 2-4 times fewer labels for training.

INTRODUCTION

Observing and imaging objects in the sky has been the main driver of the scientific discovery process in astronomy, because doing controlled experiments is not a viable option. The advent of large digital sky surveys in the 1990s, spearheaded by SDSS (Gunn et al. 1998, 2006), has rendered obsolete the approach of manual inspection of images by an expert. […] the quantity and quality of (manually assigned) image labels. The serendipitous discovery of an ionization echo from a recently faded quasar (Lintott et al. 2009), and the cumbersome search for similar systems that followed (Keel …), demonstrate the need for methods which allow for the discovery of truly unusual and previously unseen objects.
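The contrastive framework the abstract refers to can be illustrated with the normalized temperature-scaled cross-entropy (NT-Xent) objective common to SimCLR-style contrastive learning: two augmented views of each image are embedded, and each embedding is trained to identify its partner view among all other embeddings in the batch. The function below is a minimal NumPy sketch under that assumption, not the authors' implementation; the batch size, embedding dimension, and temperature are illustrative.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss (a SimCLR-style sketch, not the paper's code).

    z1, z2: (N, D) arrays of embeddings of two augmented views of N images.
    Row i of z1 and row i of z2 form the positive pair; all other rows in
    the batch act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)                # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)    # unit-normalize rows
    sim = z @ z.T / temperature                         # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                      # exclude self-similarity
    n = z1.shape[0]
    # the positive partner of index i is i+n (first half) or i-n (second half)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # cross-entropy of the positive logit against all logits in the row
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(2 * n), pos] - logsumexp)
    return loss.mean()
```

Minimizing this loss pulls the two views of each galaxy together in embedding space while pushing apart views of different galaxies, which is what makes the resulting representations usable directly as features or as an initialization for fine-tuning.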
Dec-23-2020