DALL·E 2, Explained: The Promise And Limitations Of A Revolutionary AI


DALL·E 2 is the newest AI model by OpenAI. If you've seen some of its creations and think they're amazing, keep reading to understand why you're totally right, but also wrong.

OpenAI published a blog post and a paper entitled "Hierarchical Text-Conditional Image Generation with CLIP Latents" on DALL·E 2. The post is fine if you want a glimpse of the results, and the paper is great for understanding the technical details, but neither explains in depth what makes DALL·E 2 amazing, or what makes it not so amazing. That's what this article is for.

DALL·E 2 is the new version of DALL·E, a generative language model that takes sentences and creates corresponding original images. At 3.5B parameters, DALL·E 2 is a large model, but not nearly as large as GPT-3 and, interestingly, smaller than its predecessor (12B). Despite its smaller size, DALL·E 2 generates images with 4x the resolution of DALL·E's, and human judges prefer it more than 70% of the time for both caption matching and photorealism.

As with DALL·E, OpenAI didn't release DALL·E 2 (you can always join the never-ending waitlist). However, they open-sourced CLIP which, although only indirectly related to DALL·E, forms the basis of DALL·E 2. (CLIP is also the basis of the apps and notebooks that people who can't access DALL·E 2 are using.)
