Generative AI
The Importance of International Norms in Artificial Intelligence Ethics
DALL-E 2, an image-generating artificial intelligence (AI), has captured the public's attention with stunning portrayals of Godzilla eating Tokyo and photorealistic images of astronauts riding horses in space. The model is the newest iteration of a text-to-image algorithm, an AI model that can generate images based on text descriptions. OpenAI, the company behind DALL-E 2, used a language model, GPT-3, and a computer vision model, CLIP, to train DALL-E 2 using 650 million images with associated text captions. The integration of these two models made it possible for OpenAI to train DALL-E 2 to generate a vast array of images in many different styles. Despite DALL-E 2's impressive accomplishments, there are significant issues with how the model portrays people and how it has acquired biases from the data it was trained on.
You can now sell your DALL-E 2 art, but it feels murky
OpenAI has promised to give one million subscribers on the DALL-E 2 waitlist access to the astonishing AI art generator, and now they can sell the images their prompts create. As users, including LinkedIn founder and tech entrepreneur Reid Hoffman, release their art for sale, even more questions are being raised about the new form of digital art. A product of the Elon Musk co-founded research lab OpenAI, DALL-E 2 is the most impressive of the AI art generators, and even the weirdest AI art from DALL-E 2 has made a splash. But this tech has always raised concerns, not least because AI art generators could threaten the jobs of, you know, real human artists. Now we're beginning to see the first art created by DALL-E 2, with input from users, go on sale.
Quick-fire Guide to Multi-Modal ML With OpenAI's CLIP
Contrastive Language-Image Pretraining (CLIP) consists of two models trained in parallel. During training, (image, text) pairs are fed into the respective models, and both output a 512-dimensional vector embedding that represents the respective image/text in vector space. The contrastive component takes these two vector embeddings and calculates the model loss as the difference (i.e., the contrast) between the two vectors. Both models are then optimized to minimize this difference and therefore learn to embed matching (image, text) pairs close together in the same vector space. After this contrastive pretraining process, we are left with CLIP, a multi-modal model capable of understanding both language and images via a shared vector space.
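The contrastive step described above can be sketched in a few lines. This is a minimal illustration, not CLIP's actual training code: random vectors stand in for the outputs of the two encoders, the function names are hypothetical, and the temperature of 0.07 is an assumed value. It also assumes the commonly described formulation in which the loss is a symmetric cross-entropy over the pairwise cosine similarities of a batch, rather than a literal vector difference.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cross_entropy(scores, target):
    """Negative log-softmax of scores at the target index (numerically stable)."""
    peak = max(scores)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum - scores[target]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch where pair i matches pair i."""
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    n = len(images)
    # logits[i][j] is the scaled cosine similarity of image i and text j.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Each image should pick out its own caption, and each caption its own image.
    image_to_text = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    columns = [list(col) for col in zip(*logits)]
    text_to_image = sum(cross_entropy(columns[i], i) for i in range(n)) / n
    return (image_to_text + text_to_image) / 2


# Stand-in embeddings (assumption): 4 pairs of 512-dimensional vectors.
random.seed(0)
dim, batch = 512, 4
image_embs = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]
# Matched captions sit very close to their images in the shared space.
text_matched = [[x + 0.01 * random.gauss(0, 1) for x in v] for v in image_embs]
# Rotating the captions by one position pairs every image with the wrong text.
text_mismatched = text_matched[1:] + text_matched[:1]

matched_loss = contrastive_loss(image_embs, text_matched)
mismatched_loss = contrastive_loss(image_embs, text_mismatched)
print("matched loss:", matched_loss)
print("mismatched loss:", mismatched_loss)
```

The loss is low when each image lies closest to its own caption and high when the pairs are scrambled, which is the signal that pushes both encoders toward a shared embedding space.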
New-and-Improved Content Moderation Tooling
We are introducing a new-and-improved content moderation tool: The Moderation endpoint improves upon our previous content filter, and is available for free today to OpenAI API developers. To help developers protect their applications against possible misuse, we are introducing the faster and more accurate Moderation endpoint. This endpoint provides OpenAI API developers with free access to GPT-based classifiers that detect undesired content -- an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation. When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm -- content prohibited by our content policy.
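As a rough sketch of how a developer might work with such an endpoint, the snippet below builds a request and reads category flags from a response. It makes several assumptions: that the endpoint lives at `https://api.openai.com/v1/moderations`, that it accepts a JSON body with an `input` field, and that it returns a `results` list with per-category booleans. The helper names and the sample response are illustrative only, and no network call is made.

```python
import json
import urllib.request

# Assumed endpoint URL; check the official API reference before relying on it.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_moderation_request(text, api_key):
    """Prepare (but do not send) a POST request carrying the text to classify."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def flagged_categories(response):
    """Return the sorted names of categories marked true in the first result."""
    result = response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


# Hand-written sample response (assumption), shaped like the fields described above.
sample_response = {
    "results": [
        {
            "flagged": True,
            "categories": {
                "hate": False,
                "self-harm": False,
                "sexual": False,
                "violence": True,
            },
        }
    ]
}

request = build_moderation_request("example text", api_key="YOUR_API_KEY")
flagged = flagged_categories(sample_response)
print(request.get_method(), request.full_url)
print("flagged categories:", flagged)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` and decoding the JSON body. That step is left out here so the sketch runs without credentials.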
Last Week in AI #176: Drones beat human pilots in first fair race, better call quality with AI, how artists view AI-generated art, and more!
A year ago, researchers from the University of Zurich showcased their autonomous drones that were able to beat the fastest human pilots. However, that race wasn't "fair" in the sense that the AI algorithm commanding the drones had extra information that human pilots didn't have. In particular, the algorithm had access to near-perfect location and velocity estimation of the drones using motion capture systems, high-quality maps of the race course beforehand, and stereo cameras that can give depth information. This year, the team's autonomous drones raced on an even playing field without these advantages, and its AI was able to beat the best human-controlled time by 0.5s in a three-lap race, a significant lead in the world of drone racing. Our take: This development is representative of AI progress in many fields, where the researchers first make a working system with additional assumptions and then slowly chip away at these assumptions for a more robust and adaptable AI system.
MIT Researchers use OpenAI Codex to Build an ML-based Mathematics Problem-generator
OpenAI Codex is one of the most powerful GPT-3-based language-to-code neural network platforms for high-speed programming. OpenAI Codex is used in a large number of AI and machine learning projects in a safe AGI environment. As the demand for Codex programmers increases, a large number of AI researchers are also turning to OpenAI's GPT-3 offering to improve their understanding of neural networks for complex problems. In one such development, a group of machine learning researchers and faculty members from MIT, Columbia University, Harvard University, and the University of Waterloo have built a machine learning algorithm using OpenAI Codex. This new algorithm can solve, explain, and generate complex mathematical problems.
In Humanity's Collective Unconscious, the Body Is a Bad Dream
The first images that I tried to generate from Dall-E Mini were of cartoon characters getting colonoscopies. The website, now called Craiyon, houses an artificial intelligence model that turns any submitted string of text into pictures. I'm a practicing gastroenterologist who identifies as online, but not very, and the viral tweets that introduced me to the model also served as my yardstick for submissions that felt appropriately niche. Those early efforts bore mixed results: Daffy Duck was shown standing atop a stretcher, Porky Pig registered as an actual pig, and Bugs Bunny was inserted directly into a human colon, his gray ears bleeding into its pink folds. I've since defaulted to the curated experience of Twitter accounts like Weird Dall-E Mini Generations, a sort of greatest hits collection culled from Reddit.
Quantum Mirror lets Sony shooters send images straight from their camera to the DALL-E 2 AI
More commonly known for his adventures with 3D printing, he's turned his hand towards coding. This time, he's produced an app for Sony mirrorless cameras called Quantum Mirror that submits the photos you shoot to the DALL-E 2 AI to generate variants of them for some interesting and sometimes surreal results. Nick stresses that he's not affiliated with either Sony or OpenAI and this is not an official app. In fact, it's so unofficial that he also provided a disclaimer stating that its use will get your account banned – they considered it a "web scraping tool", apparently. So, he says not to install it, but to check back once the beta is over and there's a public API available.