Generative AI
Newsletter July 2022 2 -- Deep Learning
This newsletter is all about deep learning, data science, machine learning, and natural language processing (NLP): our core content. We are taking a one-newsletter break from blockchain and interviews. Don't worry, they will be out again later this month. There are some cool new AI kids on the block: DALL-E 2 (OpenAI) and Imagen (Google). First, read our warning below.
How OpenAI Reduces risks for DALL·E 2
Originally published on Towards AI the World's Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses. It's free, we don't spam, and we never share your email address.
You've all seen amazing-looking images like these, entirely generated by an artificial intelligence model. I covered multiple approaches on my channel, like Craiyon, Imagen, and the most well-known, DALL·E 2. Most people want to try them and generate images from random prompts, but the majority of these models aren't open-source, which means regular people like us cannot use them freely. This is what we will dive into in this article. I said most of them were not open-source. Well, Craiyon is, and people have generated amazing memes using it.
A First Look at DALL-E 2 -- How It Works Under the Hood
DALL-E 2 is the successor to OpenAI's DALL-E model. The name DALL-E is a portmanteau of WALL-E (a sci-fi film by Pixar) and Salvador Dalí (a Spanish artist renowned for the surrealist style of his paintings). The model generates photorealistic images from a given text description. It is not available to the public yet, but the OpenAI team has published a demo on their website. As you can see, these are images that would take an artist or graphic designer hours, if not days, to produce, yet DALL-E 2 produces them in a matter of minutes, and the results are impressive.
Meta's latest generative AI system creates stunning images from sketches and text - SiliconANGLE
Meta Platforms Inc. today unveiled an advanced "generative artificial intelligence system" that's designed to help artists better showcase their creativity. The system, called "Make-A-Scene," is meant to demonstrate how AI has the potential to empower anyone to bring their imagination to life. The user can simply describe and illustrate their vision through a combination of text descriptions and freeform sketches, and the AI will come up with a stunning representation of it. As the company explains in a blog post, generative AI is already used by a number of artists to augment their creativity. Examples include expressive avatars, animating children's drawings, creating virtual worlds in the metaverse and producing digital artworks using only text-based descriptions.
Techniques for Training Large Neural Networks
Large neural networks are at the core of many recent advances in AI, but training them is a difficult engineering and research challenge that requires orchestrating a cluster of GPUs to perform a single synchronized calculation. As cluster and model sizes have grown, machine learning practitioners have developed an increasing variety of techniques to parallelize model training over many GPUs. At first glance, understanding these parallelism techniques may seem daunting, but with only a few assumptions about the structure of the computation they become much clearer: at that point, you're just shuttling opaque bits from A to B, like a network switch shuttles packets. [Figure: each color refers to one layer, and dashed lines separate different GPUs.] Training a neural network is an iterative process.
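To make the idea of a "single synchronized calculation" concrete, here is a minimal sketch of one common parallelism technique, data parallelism. The excerpt above does not name this technique, so treat the choice as an assumption. It is simulated in plain Python on one machine: the "workers" are ordinary function calls, the model is a toy one-parameter linear fit, and the names `grad_mse`, `shard`, `data_parallel_grad`, and `train` are hypothetical helpers invented for this illustration. Each worker computes a gradient on its own shard of the batch, and the gradients are then averaged, which is the step a real cluster would perform with an all-reduce.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean squared error for the toy model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def shard(seq, num_workers):
    """Split seq into num_workers equal-sized contiguous shards."""
    assert len(seq) % num_workers == 0, "batch must divide evenly across workers"
    size = len(seq) // num_workers
    return [seq[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_grad(w, xs, ys, num_workers):
    """Simulate data parallelism: per-shard gradients, then an average."""
    x_shards = shard(xs, num_workers)
    y_shards = shard(ys, num_workers)
    # Each simulated worker holds a full copy of w and sees only its shard.
    local_grads = [grad_mse(w, xs_i, ys_i)
                   for xs_i, ys_i in zip(x_shards, y_shards)]
    # Averaging stands in for the all-reduce that synchronizes real workers.
    return sum(local_grads) / num_workers


def train(xs, ys, num_workers, lr=0.01, steps=200):
    """Iterative training loop: every step uses the synchronized gradient."""
    w = 0.0
    for _ in range(steps):
        w -= lr * data_parallel_grad(w, xs, ys, num_workers)
    return w


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]  # generated by y = 2 * x
    print(train(xs, ys, num_workers=2))
```

Because the shards are equal in size, the averaged gradient matches the gradient computed over the whole batch, so the simulated workers follow the same training trajectory as a single device would. Real systems add communication costs and other forms of parallelism that this sketch leaves out.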
UX principles for AI art tools like DALL·E
It's incredible that a picture can be generated from a single text phrase, and while many are satisfied to stop there, other creators want more control over the images. Even in these early days, there are already multiple ways to shape the AI's results. One is the wording of the prompt itself: although this is neither a separate feature nor an integrated part of the interfaces, many independent guides and image tests aim to provide style names and other keywords that yield desired results. Midjourney also published some helpful prompting tips. When dealing with images, there are inevitable trade-offs between size, quality, and speed. Because these tools are built on cloud-based systems, we're able to use hardware we wouldn't ordinarily have access to, enabling significantly faster image generation.
DALL-E true significance
Originally published on Towards AI.
Colossal-AI, A Unified Deep Learning System for Big Models, Seamlessly Accelerates Large Models at Low Costs with Hugging Face
According to a Forbes article, large AI models are considered one of six AI trends to watch for in 2022. As large-scale AI models continue to perform strongly across different domains, new trends are emerging that lead to distinctive and efficient AI applications not previously seen in the industry. For example, Microsoft-owned GitHub and OpenAI recently partnered to launch Copilot. Copilot plays the role of an AI pair programmer, offering suggestions for code and entire functions in real time. Such developments continue to make coding easier than before. Another example released by OpenAI, DALL-E 2, is a powerful tool that creates original and realistic images as well as art from only simple text.
Meta's 'Make-A-Scene' AI blends human and computer imagination into algorithmic art
Text-to-image generation is the hot algorithmic process right now, with Craiyon (formerly DALL-E mini) and Google's Imagen AIs unleashing tidal waves of wonderfully weird procedurally generated art synthesized from human and computer imaginations. On Tuesday, Meta revealed that it too has developed an AI image generation engine, one that it hopes will help to build immersive worlds in the Metaverse and create high digital art. A lot of work goes into creating an image based on just the phrase, "there's a horse in the hospital," when using a generative AI. First the phrase itself is fed through a transformer model, a neural network that parses the words of the sentence and develops a contextual understanding of their relationship to one another. Once it gets the gist of what the user is describing, the AI will synthesize a new image using a set of GANs (generative adversarial networks).