VAEs in the Presence of Missing Data
Mark Collier, Alfredo Nazabal, Christopher K. I. Williams
Real world datasets often contain entries with missing elements, e.g. in a medical dataset, a patient is unlikely to have taken all possible diagnostic tests. Variational Autoencoders (VAEs) are popular generative models often used for unsupervised learning. Despite their widespread use, it is unclear how best to apply VAEs to datasets with missing data. We develop a novel latent variable model of a corruption process which generates missing data, and derive a corresponding tractable evidence lower bound (ELBO). Our model is straightforward to implement, can handle both missing completely at random (MCAR) and missing not at random (MNAR) data, scales to high dimensional inputs and gives both the VAE ...

Existing approaches which adapt VAEs to datasets with missing data (Vedantam et al., 2017; Nazabal et al., 2018; Mattei & Frellsen, 2019; Ma et al., 2019) suffer from a number of significant disadvantages, including 1) not handling missing not at random (MNAR) data, 2) replacing missing elements with zeros with no way to distinguish an observed data element with value zero from a missing element, 3) not scaling to high dimensional inputs and/or 4) restricting the types of neural network architectures permitted. These issues are discussed in detail below. We aim to improve upon the handling of missing data by VAEs by addressing the disadvantages of the existing approaches. In particular, we propose a novel latent variable probabilistic model of missing data as the result of a corruption process, and derive a tractable ELBO for our proposed model.
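The zero-imputation ambiguity and the MCAR/MNAR distinction mentioned above can be made concrete with a small NumPy sketch. Everything in it is an illustrative assumption rather than the authors' model: the function names, the toy data, the threshold rule used to produce MNAR missingness, and the mask-concatenation workaround (a generic way to disambiguate zeros, shown only for contrast) are all hypothetical.

```python
import numpy as np


def mcar_mask(x, p_missing, rng):
    """MCAR (assumed toy rule): each entry is dropped with the same
    probability, independently of its value. 1 = observed, 0 = missing."""
    return (rng.random(x.shape) >= p_missing).astype(x.dtype)


def mnar_mask(x, threshold):
    """MNAR (assumed toy rule): missingness depends on the value itself;
    here, entries above `threshold` are never observed."""
    return (x <= threshold).astype(x.dtype)


def zero_impute(x, mask):
    """Replace missing entries with zeros, discarding the mask."""
    return x * mask


def zero_impute_with_indicator(x, mask):
    """Zero-impute, then append the mask so that an observed zero and a
    missing entry produce different inputs (generic workaround)."""
    return np.concatenate([x * mask, mask], axis=-1)


# Toy data: row 0 contains a genuinely observed zero in column 0.
x = np.array([[0.0, 2.0],
              [5.0, 3.0]])

# With threshold 4.0, only the value 5.0 in row 1 goes missing.
mask = mnar_mask(x, threshold=4.0)

plain = zero_impute(x, mask)
with_indicator = zero_impute_with_indicator(x, mask)

# Column 0 is zero in both rows after plain imputation, although one zero
# was observed and the other stands in for a missing 5.0.
ambiguous = plain[0, 0] == plain[1, 0]

# The appended mask column separates the two cases again.
distinguishable = with_indicator[0, 2] != with_indicator[1, 2]

# An MCAR mask for comparison; its pattern carries no information about x.
rng = np.random.default_rng(0)
random_mask = mcar_mask(x, p_missing=0.5, rng=rng)
```

In this toy setting the plain zero-imputed input cannot tell the observed zero from the missing entry, which is disadvantage 2) in the list above, while the MNAR mask shows why missingness itself can be informative about the underlying value.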
Jul-13-2020