Diverse Image Captioning with Context-Object Split Latent Spaces

Neural Information Processing Systems 

Figure 1: Context-object split latent space of our COS-CVAE to exploit similarities in the contextual annotations for diverse captioning.