white cat
Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space
Liwei Wang, Alexander Schwing, Svetlana Lazebnik
This paper explores image caption generation using conditional variational autoencoders (CVAEs). Standard CVAEs with a fixed Gaussian prior yield descriptions with too little variability. Instead, we propose two models that explicitly structure the latent space around K components corresponding to different types of image content, and combine components to create priors for images that contain multiple types of content simultaneously (e.g., several kinds of objects). Our first model uses a Gaussian Mixture Model (GMM) prior, while the second one defines a novel Additive Gaussian (AG) prior that linearly combines component means. We show that both models produce captions that are more diverse and more accurate than a strong LSTM baseline or a "vanilla" CVAE with a fixed Gaussian prior, with AG-CVAE showing particular promise.
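The difference between the two priors named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the shared isotropic standard deviation `sigma`, and the normalization of the per-image weights to sum to 1 are all assumptions made for illustration. The GMM-style sampler picks one component per draw, while the AG-style sampler draws around a single mean formed as a weighted sum of the component means.

```python
import numpy as np


def additive_gaussian_mean(weights, component_means):
    """Linearly combine K component means into one prior mean of shape (D,).

    Assumption: weights are non-negative and are normalized here to sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(component_means, dtype=float)  # shape (K, D)
    w = w / w.sum()
    return w @ mu


def sample_ag_prior(rng, weights, component_means, sigma=1.0):
    """Draw z from a single Gaussian centered on the combined mean."""
    mean = additive_gaussian_mean(weights, component_means)
    return mean + sigma * rng.standard_normal(mean.shape)


def sample_gmm_prior(rng, weights, component_means, sigma=1.0):
    """Draw z by first choosing one component, then sampling around its mean."""
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(component_means, dtype=float)
    w = w / w.sum()
    k = rng.choice(len(w), p=w)  # index of the selected component
    return mu[k] + sigma * rng.standard_normal(mu[k].shape)


# Hypothetical setup: K = 3 content types, latent dimension D = 2.
means = [[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]
present = [1, 1, 0]  # the image contains content types 0 and 1 only

rng = np.random.default_rng(0)
z_ag = sample_ag_prior(rng, present, means, sigma=0.1)
z_gmm = sample_gmm_prior(rng, present, means, sigma=0.1)
```

With these illustrative values, the AG sample is centered between the two active component means, whereas each GMM sample lands near one active component or the other.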
The Genius Author Who Turns Fairy Tales Inside Out
In the 22 years since the publication of her first story collection, Stranger Things Happen, Kelly Link's fiction has crept from the status of cult favorite to something approaching the mainstream, or rather, the mainstream has crept toward her. Link has never written a novel, only short stories (although a novel has been promised for next year), and her first two books were published by the small press she operates with her husband, Gavin Grant. Furthermore, she writes in genres once regarded as peripheral: fantasy and (occasionally) science fiction. None of this has been considered conducive to literary fame, but times have changed. Novelists ranging from Michael Chabon (a big Link fan) to Kate Atkinson have dissolved many of the boundaries between genre fiction and the mainstream.