Learning Structured Output Representation using Deep Conditional Generative Models
Kihyuk Sohn, Honglak Lee, Xinchen Yan
–Neural Information Processing Systems
Supervised deep learning has been successfully applied to many recognition problems. Although it can approximate a complex many-to-one function well when a large amount of training data is provided, it is still challenging to model complex structured output representations that effectively perform probabilistic inference and make diverse predictions. In this work, we develop a deep conditional generative model for structured output prediction using Gaussian latent variables. The model is trained efficiently in the framework of stochastic gradient varia-tional Bayes, and allows for fast prediction using stochastic feed-forward inference. In addition, we provide novel strategies to build robust structured prediction algorithms, such as input noise-injection and multi-scale prediction objective at training. In experiments, we demonstrate the effectiveness of our proposed algorithm in comparison to the deterministic deep neural network counterparts in generating diverse but realistic structured output predictions using stochastic inference. Furthermore, the proposed training methods are complimentary, which leads to strong pixel-level object segmentation and semantic labeling performance on Caltech-UCSD Birds 200 and the subset of Labeled Faces in the Wild dataset.
Neural Information Processing Systems
Oct-2-2025, 10:43:22 GMT
- Country:
- North America > United States
- California (0.04)
- Massachusetts > Hampshire County
- Amherst (0.04)
- Michigan > Washtenaw County
- Ann Arbor (0.04)
- North America > United States
- Genre:
- Research Report
- Experimental Study (0.46)
- New Finding (0.46)
- Research Report
- Technology: