A method based on Generative Adversarial Networks for disentangling physical and chemical properties of stars in astronomical spectra

Santoveña, Raúl, Dafonte, Carlos, Manteiga, Minia

arXiv.org Artificial Intelligence 

Data compression techniques focused on information preservation have become essential in the modern era of big data. In this work, an encoder-decoder architecture has been designed in which adversarial training is used. The goal of this proposal is to obtain an intermediate representation of astronomical stellar spectra in which the contribution to the flux of a star due to the most influential physical properties (its surface temperature and gravity) disappears and the variance reflects only the effect of the chemical composition on the spectrum. We apply a deep learning scheme with the aim of disentangling, in the latent space, the desired parameters from the rest of the information contained in the data. This work proposes a version of adversarial training that uses one discriminator per parameter to be disentangled, thus avoiding the exponential growth in combinations that occurs with a single discriminator as a result of the discretization of the values to be disentangled. To test the effectiveness of the method, synthetic astronomical data from the APOGEE and Gaia surveys are used. In conjunction with the work presented, an ad hoc framework (GANDALF) is provided, which allows the replication, visualization, and extension of the method to domains of any nature.

Keywords: Generative Adversarial Neural Networks, Disentangled Representation, Astronomical Spectra, Gaia mission, APOGEE

Preprint submitted to Applied Soft Computing, November 8, 2024

1. Introduction

Finding representations of the data that can ease the extraction of useful information and improve algorithm performance in classification or parametrization problems has become a field in itself in the machine learning community, and is known as representation learning. The process of unraveling these underlying factors in a comprehensive representation is called disentangled representation.
There is abundant literature on the problem of how to decode or separate representations of a signal into projections that include only information relevant to a specific problem. In Wang et al. [1] the current state of the literature is exhaustively reviewed, discussing different methodologies, metrics, models, and applications.
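
To illustrate the combinatorial argument made in the abstract, the following minimal sketch compares the number of output classes needed by a single discriminator over jointly discretized parameters with the number needed when each parameter has its own discriminator. It is not the GANDALF implementation: the parameter ranges, the number of bins, and the helper names (discretize, joint_classes, separate_classes) are assumptions chosen only for illustration.

```python
import numpy as np


def discretize(values, n_bins):
    """Map continuous parameter values to integer bin labels 0..n_bins-1."""
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Use interior edges only, so that labels fall in 0..n_bins-1.
    return np.digitize(values, edges[1:-1])


def joint_classes(bins_per_param):
    """Output classes of a single discriminator over all parameters."""
    return int(np.prod(bins_per_param))


def separate_classes(bins_per_param):
    """Total output classes with one discriminator per parameter."""
    return int(np.sum(bins_per_param))


rng = np.random.default_rng(0)
# Hypothetical ranges, not taken from the paper.
teff = rng.uniform(4000.0, 8000.0, size=1000)  # surface temperature, K
logg = rng.uniform(1.0, 5.0, size=1000)        # surface gravity, dex

bins = [10, 10]  # assumed number of bins per parameter
teff_labels = discretize(teff, bins[0])
logg_labels = discretize(logg, bins[1])

# A single discriminator has to predict the combined (joint) label.
joint_labels = teff_labels * bins[1] + logg_labels

print("single discriminator classes:", joint_classes(bins))
print("per-parameter classes in total:", separate_classes(bins))
```

With two parameters and ten bins each, the joint labelling needs 100 classes while the per-parameter scheme needs 20 in total; adding a third parameter with ten bins raises these to 1000 and 30, which is the growth the per-parameter design is meant to avoid.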