Learn, Imagine and Create: Text-to-Image Generation from Prior Knowledge
Qiao, Tingting, Zhang, Jing, Xu, Duanqing, Tao, Dacheng
Text-to-image generation, i.e., generating an image given a text description, is a very challenging task due to the significant semantic gap between the two domains. Humans, however, tackle this problem intelligently. We learn from diverse objects to form a solid prior about semantics, textures, colors, shapes, and layouts. Given a text description, we immediately imagine an overall visual impression from this prior and, based on this, we draw a picture by progressively adding more and more details. In this paper, inspired by this process, we propose a novel text-to-image method called LeicaGAN that combines these three phases (learning, imagining, and creating) in a unified framework.
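The abstract's progressive drawing idea, sketching a coarse visual impression and then refining it, maps naturally onto a cascade of generators at increasing resolutions. The PyTorch sketch below is a minimal, hypothetical illustration of that coarse-to-fine structure; the class name, dimensions, stage count, and layer choices are assumptions for illustration, not LeicaGAN's actual architecture, and the word-level attention used in such cascades is omitted for brevity.

```python
import torch
import torch.nn as nn

class CascadedGenerator(nn.Module):
    """Hypothetical coarse-to-fine cascade: each stage upsamples the
    previous feature map and refines it into a higher-resolution image."""

    def __init__(self, text_dim=256, z_dim=100, base_ch=64, n_stages=3):
        super().__init__()
        self.base_ch = base_ch
        # "Imagine": project text embedding + noise into a coarse 4x4 impression.
        self.fc = nn.Linear(text_dim + z_dim, base_ch * 4 * 4)
        # "Create": each stage doubles resolution and refines the features.
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.Upsample(scale_factor=2, mode="nearest"),
                nn.Conv2d(base_ch, base_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(base_ch),
                nn.ReLU(inplace=True),
            )
            for _ in range(n_stages)
        )
        # One RGB head per stage, so every stage emits an image.
        self.to_rgb = nn.ModuleList(
            nn.Conv2d(base_ch, 3, kernel_size=3, padding=1)
            for _ in range(n_stages)
        )

    def forward(self, text_emb, z):
        h = self.fc(torch.cat([text_emb, z], dim=1))
        h = h.view(-1, self.base_ch, 4, 4)  # coarse overall impression
        images = []
        for stage, head in zip(self.stages, self.to_rgb):
            h = stage(h)  # progressively add finer details
            images.append(torch.tanh(head(h)))
        return images  # e.g. 8x8, 16x16, 32x32 outputs


# Example: a batch of 4 captions encoded to 256-d embeddings.
g = CascadedGenerator()
outs = g(torch.randn(4, 256), torch.randn(4, 100))
print([tuple(o.shape) for o in outs])  # [(4,3,8,8), (4,3,16,16), (4,3,32,32)]
```

Each stage doubles the spatial resolution and emits its own image, so earlier outputs capture the rough impression while later ones add detail.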
Reviews: Learn, Imagine and Create: Text-to-Image Generation from Prior Knowledge
Quality: The paper is thorough in describing the method and in supporting it with experiments.
Clarity: The paper is well written and easy to follow.
Originality & Significance: Although the method is not very novel in light of Paper 1253: RecreateGAN (see more below), the experimental exploration of different settings of the method is thorough and interesting. The idea of matching local image features to word-level embeddings and global image features to sentence-level embeddings is intuitive and makes sense. This paper shares significant parts of its method with Paper 1253: RecreateGAN; in particular, the textual-visual embedding loss in Eq. (6) of this paper matches the pairwise loss defined in Eq. (5) of the other paper. However, this paper uses this component as part of a different method, namely for a textual-visual embedding rather than an image-similarity embedding. Additionally, the cascade of attentional generators is very similar between the two papers.
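For readers unfamiliar with the pairwise loss the review refers to, the sketch below shows one common way to implement a bidirectional textual-visual matching loss over a batch, where matched image-text pairs attract and all other pairs in the batch serve as negatives. The function name, the gamma smoothing factor, and the batch-negative construction are assumptions for illustration; this is not necessarily the exact Eq. (6) of the paper.

```python
import torch
import torch.nn.functional as F

def pairwise_matching_loss(img_feats, txt_feats, gamma=10.0):
    """Bidirectional matching loss: row i of each tensor is a matched
    image-text pair; every other row in the batch acts as a negative.

    img_feats: (B, D) global image features (or pooled local features)
    txt_feats: (B, D) sentence embeddings (or pooled word embeddings)
    gamma:     smoothing factor sharpening the softmax (illustrative value)
    """
    img = F.normalize(img_feats, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    sim = gamma * img @ txt.t()  # (B, B) cosine-similarity scores
    labels = torch.arange(sim.size(0), device=sim.device)
    # Posterior of the matching text given each image, and vice versa.
    loss_i2t = F.cross_entropy(sim, labels)
    loss_t2i = F.cross_entropy(sim.t(), labels)
    return loss_i2t + loss_t2i


# Example: a batch of 8 image/sentence feature pairs of dimension 256.
loss = pairwise_matching_loss(torch.randn(8, 256), torch.randn(8, 256))
```

The same recipe applies at the word level by pooling word-to-region similarities into a single score per image-sentence pair before the softmax.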