Learning Consistent Deep Generative Models from Sparse Data via Prediction Constraints
Hope, Gabriel, Abdrakhmanova, Madina, Chen, Xiaoyin, Hughes, Michael C., Sudderth, Erik B.
We develop a new framework for learning variational autoencoders and other deep generative models that balances generative and discriminative goals. Our framework optimizes model parameters to maximize a variational lower bound on the likelihood of observed data, subject to a task-specific prediction constraint that prevents model misspecification from leading to inaccurate predictions. We further enforce a consistency constraint, derived naturally from the generative model, that requires predictions on reconstructed data to match those on the original data. We show that these two contributions, prediction constraints and consistency constraints, lead to promising image classification performance, especially in the semi-supervised scenario where category labels are sparse but unlabeled data is plentiful. Our approach enables advances in generative modeling to directly boost semi-supervised classification performance, an ability we demonstrate by augmenting deep generative models with latent variables capturing spatial transformations.

We develop broadly applicable methods for learning flexible models of high-dimensional data, like images, that are paired with (discrete or continuous) labels. We are particularly interested in semi-supervised learning (Zhu, 2005; Oliver et al., 2018) from data that is sparsely labeled, a common situation in practice due to the cost or privacy concerns associated with data annotation. Given a large and sparsely labeled dataset, we seek a single probabilistic model that simultaneously makes good predictions of labels and provides a high-quality generative model of the high-dimensional input data. Strong generative models are valuable because they allow incorporation of domain knowledge, can address partially missing or corrupted data, and can be visualized to improve interpretability.
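The objective described above can be sketched as a Lagrangian-style relaxation: the negative variational lower bound plus penalty terms for the prediction and consistency constraints. This is a minimal illustrative sketch, not the paper's exact formulation; the multiplier names (`lambda_pc`, `lambda_cons`) and the additive penalty form are assumptions made here for concreteness.

```python
def pc_vae_loss(neg_elbo, pred_loss, cons_loss,
                lambda_pc=10.0, lambda_cons=1.0):
    """Combine generative and discriminative terms into one scalar loss.

    neg_elbo  : negative variational lower bound on log p(x)
    pred_loss : supervised loss of the label predictor on the original input x
    cons_loss : divergence between predictions on x and on its
                reconstruction x_hat (the consistency constraint)
    lambda_pc, lambda_cons : hypothetical penalty weights standing in for
                the constraint multipliers (illustrative values only)
    """
    return neg_elbo + lambda_pc * pred_loss + lambda_cons * cons_loss

# Toy usage with made-up scalar loss values:
total = pc_vae_loss(neg_elbo=120.0, pred_loss=0.7, cons_loss=0.05)
```

In practice each of the three terms would be computed per minibatch by the encoder, decoder, and prediction networks; the sketch only shows how the constrained objective trades off generative fit against predictive accuracy.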
Prior approaches for the semi-supervised learning of deep generative models include methods based on variational autoencoders (VAEs) (Kingma et al., 2014; Siddharth et al., 2017), generative adversarial networks (GANs) (Dumoulin et al., 2017; Kumar et al., 2017), and hybrids of the two (Larsen et al., 2016; de Bem et al., 2018; Zhang et al., 2019). While these approaches all allow data to be sampled, a major shortcoming is that they do not adequately use labels to inform the generative model.
Dec-11-2020