 Generative AI


DALL·E 2 Artificial Intelligence System Has Created its Own Secret Language Nobody Understands

#artificialintelligence

Researcher and Computer Science PhD student Giannis Daras has uncovered a secret language that DALL·E 2, a cutting-edge text-to-image artificial intelligence generator, has created. It's believed that DALL·E 2 makes up its own words to make sense of the images it generates. Daras then fed these words back to the system and, apparently, the AI understood exactly what it was reading. Daras thinks that this is a big security hole for the text-to-image generator, as it could enable backdoor adversarial attacks or provide ways to circumvent filters. As of now, Natural Language Processing systems filter text prompts that violate the policy rules, and gibberish prompts may eventually be used by attackers to bypass these filters.


Best Practices for Deploying Language Models

#artificialintelligence

Cohere, OpenAI, and AI21 Labs have developed a preliminary set of best practices applicable to any organization developing or deploying large language models. Computers that can read and write are here, and they have the potential to fundamentally impact daily life. The future of human–machine interaction is full of possibility and promise, but any powerful technology needs careful deployment. The joint statement below represents a step towards building a community to address the global challenges presented by AI progress, and we encourage other organizations who would like to participate to get in touch. We're recommending several key principles to help providers of large language models (LLMs) mitigate the risks of this technology in order to achieve its full promise to augment human capabilities.


Google's Imagen Text-to-Image Diffusion Model With Deep Language Understanding Defeats DALL-E 2

#artificialintelligence

Text-to-image diffusion models that can generate and edit photorealistic images have become a hot AI research area, with their incredible synthetic images garnering widespread mainstream media coverage. An advanced image generation approach, diffusion models have surpassed previous high-performance methods such as GANs (generative adversarial networks) in both image fidelity and diversity and are now demonstrating their potential in text-to-image generation. In the new paper Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding, a Google Brain research team advances this research field with Imagen, a text-to-image diffusion model that combines the deep language understanding of transformer-based large language models and the photorealistic image generation capabilities of diffusion models to achieve a new state-of-the-art FID score of 7.27 on the COCO dataset. Imagen's training data was drawn from massive datasets of image and English alt-text pairs. Like previous text-to-image models, Imagen's "wow" factor lies in its ability to generate photorealistic and high-resolution images from fanciful prompts such as "A cute corgi lives in a house made out of sushi" or "A dragon fruit wearing a karate belt in the snow."


Who's liable for AI-generated lies? – TechCrunch

#artificialintelligence

Who will be liable for harmful speech generated by large language models? As advanced AIs such as OpenAI's GPT-3 are being cheered for impressive breakthroughs in natural language processing and generation -- and all sorts of (productive) applications for the tech are envisaged from slicker copywriting to more capable customer service chatbots -- the risks of such powerful text-generating tools inadvertently automating abuse and spreading smears can't be ignored. Nor can the risk of bad actors intentionally weaponizing the tech to spread chaos, scale harm and watch the world burn. Indeed, OpenAI is concerned enough about the risks of its models going "totally off the rails", as its documentation puts it at one point (in reference to a response example in which an abusive customer input is met with a very troll-esque AI reply), to offer a free content filter that "aims to detect generated text that could be sensitive or unsafe coming from the API" -- and to recommend that users don't return any generated text that the filter deems "unsafe". But, given the novel nature of the technology, there are no clear legal requirements that content filters must be applied.
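The recommendation quoted above (do not return generated text that the filter deems "unsafe") can be sketched as a simple gate. This is a minimal illustration, not OpenAI's API: `filtered_reply`, `toy_classifier`, `FALLBACK`, and the label strings are all hypothetical names introduced here, and the word-list classifier only stands in for a real trained filter.

```python
from typing import Callable

# Hypothetical placeholder shown to the user when a reply is withheld.
FALLBACK = "[response withheld by content filter]"


def filtered_reply(generated: str, classify: Callable[[str], str]) -> str:
    """Return the generated text unless the filter labels it 'unsafe'."""
    label = classify(generated)
    if label == "unsafe":
        return FALLBACK
    return generated


def toy_classifier(text: str) -> str:
    # Assumption: a stand-in for a real content filter, which would be a
    # trained model rather than a single-word check.
    return "unsafe" if "idiot" in text.lower() else "safe"


if __name__ == "__main__":
    print(filtered_reply("Happy to help with your order.", toy_classifier))
    print(filtered_reply("Figure it out yourself, idiot.", toy_classifier))
```

Passing the classifier in as a callable keeps the gate independent of whichever filter a deployer actually uses.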


OpenAI's Pedagogical Method for Interpretable Machine Learning

#artificialintelligence

Originally published on Towards AI, the World's Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses. A paper that tries to lay down a pedagogical foundation for understanding how neural networks make decisions.


Google's Imagen Model Better than DALLE-2?

#artificialintelligence

Originally published on Towards AI. If you thought DALL·E 2 delivered impressive results, wait till you see what this latest Google Brain model can accomplish.


Google Imagen vs OpenAI DALL·E 2

#artificialintelligence

It is NOT a great time for OpenAI right now. It's been just over a month since DALL·E 2 was released, and just a few days ago Google decided to enter the ring with Imagen. In comparison, Imagen is a slap in the face for DALL·E 2, mainly because it outperforms DALL·E 2 in terms of AI image generation precision and quality. If by now you are wondering WTF Imagen and DALL·E 2 are, how this technology works in simple lingo, and what makes Google's Imagen so superior, then take a walk with me through this article as we discover it together, my amigo. In simple terms, both technologies allow you to generate images from text.


DALL·E 2, an AI system that revolutionized how we perceive arts

#artificialintelligence

Art is a form of expression. Its message may be symbolic or religious, historical or political. Art is there to elicit an emotional response, and to 'move' us. Artificial Intelligence, or AI for short, is the simulation of human intelligence processes by machines. Its original purpose was to replace routine jobs and repetitive tasks like the separation and segregation of materials.


OpenAI and the road to text-guided image generation: DALL·E, CLIP, GLIDE, DALL·E 2 (unCLIP)

#artificialintelligence

The first version of DALL·E was a GPT-3 style transformer decoder that autoregressively generated a 256×256 image based on textual input and an optional beginning of the image. If you want to understand how a GPT-like transformer works, here is a great visual explanation by Jay Alammar. A text is encoded by BPE-tokens (max. Because of the dVAE, some details and high-frequency features are lost in generated images, so some blurriness and smoothness are the features of the DALL·E-generated images. The transformer is a large model with 12B parameters. It consisted of 64 sparse transformer blocks with a complicated set of attention mechanisms inside: 1) classical text-to-text masked attention, 2) image-to-text attention, and 3) image-to-image sparse attention.
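To make the three attention types more concrete, here is a minimal sketch of an attention mask over a sequence of text tokens followed by image tokens. It is an illustration under stated assumptions, not DALL·E's actual implementation: the function name `build_attention_mask`, the `window` parameter, and the use of a simple local window as a stand-in for the model's sparse image-to-image patterns are all hypothetical.

```python
def build_attention_mask(n_text, n_image, window=4):
    """Build a boolean mask for a [text tokens | image tokens] sequence.

    mask[i][j] is True when position i may attend to position j.
    """
    n = n_text + n_image
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < n_text:
                # 1) Text-to-text masked attention: a text token sees
                #    itself and earlier text tokens only.
                mask[i][j] = j <= i
            elif j < n_text:
                # 2) Image-to-text attention: an image token sees all text.
                mask[i][j] = True
            else:
                # 3) Image-to-image sparse attention. Assumption: a causal
                #    local window stands in for the real sparse patterns.
                mask[i][j] = j <= i and (i - j) < window
    return mask


if __name__ == "__main__":
    # Print a small mask with 2 text tokens and 3 image tokens.
    for row in build_attention_mask(2, 3, window=2):
        print("".join("1" if allowed else "0" for allowed in row))
```

Each row of the printed mask shows which earlier positions a token may read from. Text rows never look at image positions, which matches autoregressive generation, where the text comes first.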


Horse rides astronaut

#artificialintelligence

"In the past few years, our tolerance of sloppy thinking has led us to repeat many mistakes over and over. If we are to retain any credibility, this should stop. It is hard to say where [we] have gone wronger, in underestimating language or overestimating computer programs." In April, Open AI released a neural network model called DALL-E 2 that blew people's minds; last week a new model came out from Google Brain called Imagen, and it was even better. Both turn sentences into art, and even a hardened skeptic like myself can't help but be amazed.