dalle-2
DALLE-2 is Seeing Double: Flaws in Word-to-Concept Mapping in Text2Image Models
Rassin, Royi; Ravfogel, Shauli; Goldberg, Yoav
We study the way DALLE-2 maps symbols (words) in the prompt to their referents (entities, or properties of entities, in the generated image). We show that, in stark contrast to the way humans process language, DALLE-2 does not follow the constraint that each word has a single role in the interpretation, and sometimes re-uses the same symbol for different purposes. We collect a set of stimuli that reflect the phenomenon: we show that DALLE-2 depicts both senses of nouns with multiple senses at once; and that a given word can modify the properties of two distinct entities in the image, or can be depicted as one object while also modifying the properties of another object, creating a semantic leakage of properties between entities. Taken together, our study highlights the differences between DALLE-2 and human language processing and opens an avenue for future study of the inductive biases of text-to-image models.
Artificial Intelligence, Genuine Nightmares
I distinctly remember the moment I was first intrigued by the possibilities of AI Art. On December 13th, 2021, I was scrolling through Twitter, and breaking up the endless parade of apocalyptic news stories, vicious wisecracks, and obscure movie trivia fighting for space on my feed was a puzzling and bizarre comic-grotesquerie, "Elon Musk Dying in Space," created by one @wagface. No malice towards Mr. Musk, but one thing I love about Twitter is how the rich and poor, the powerful and anonymous, are all trapped together in the same canvas sack, and this was a striking example of an anonymous online wise-guy turning the tools of cutting-edge technology against one of the world's most pre-eminent technologists. I was also totally baffled: "What, exactly, am I looking at?" The image's melted-plastic, smeared-pixel, future-primitive aesthetic couldn't have been more appropriate to the subject at hand, but how was this nightmare created, and what else is it capable of creating?
- Information Technology > Services (0.54)
- Media > News (0.34)
Where is the 'I' in 'AI' anymore?
Last month, a group of Cosmopolitan editors, alongside digital artist Karen X. Cheng and members of the artificial intelligence research lab OpenAI, created the first-ever magazine cover generated using DALLE-2. "Words I never thought I'd be saying? An image I generated is the cover of @cosmopolitan for their first ever AI-generated magazine cover #dalle #dalle2 pic.twitter.com/x2oqiNMRVx" Recently, OpenAI's GPT-3 also published a research thesis on itself.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.84)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.84)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.66)
Text To Image AI Has Created Its Own Secret Language, Researcher Claims
Here's something reassuring to think about: researchers using machine-learning artificial intelligence (AI) often don't know precisely how their algorithms are solving the problems they are tasked with. Take, for instance, the AI that can identify race from X-rays in a way no human can explain, or the Facebook AI that began to develop its own language. Joining these may be everyone's favorite text-to-image generator, DALLE-2. Computer Science PhD student Giannis Daras noticed that the DALLE-2 system, which creates images based on a text input prompt, would return nonsense words as text under certain circumstances. "A known limitation of DALLE-2 is that it struggles with text," he wrote in a paper published on the pre-print server arXiv.
Discovering the Hidden Vocabulary of DALLE-2 - Technology Org
DALLE-2 is a deep generative model that takes a text caption and generates images that match the given text. However, it has its limitations. [Image: sample image generated using DALL·E 2. Image credit: OpenAI] The researchers discover that the gibberish text DALLE-2 renders inside its images is not random but reveals a hidden vocabulary that the model seems to have developed internally: words that sound like gibberish to humans may have a meaning for DALLE-2; for example, "Vicootes" means vegetables. They note that a system behaving in such unpredictable ways may raise security concerns.
Google's Imagen Model: Better than DALLE-2?
Originally published on Towards AI, the World's Leading AI and Technology News and Media Company. If you thought DALL-E 2 delivered impressive results, wait till you see what this latest Google Brain model can accomplish.