Generative AI
RePrompt: Automatic Prompt Editing to Refine AI-Generative Art Towards Precise Expressions
Wang, Yunlong, Shen, Shuyuan, Lim, Brian Y.
Generative AI models have shown impressive ability to produce images with text prompts, which could benefit creativity in visual art creation and self-expression. However, it is unclear how precisely the generated images express contexts and emotions from the input texts. We explored the emotional expressiveness of AI-generated images and developed RePrompt, an automatic method to refine text prompts toward precise expression of the generated images. Inspired by crowdsourced editing strategies, we curated intuitive text features, such as the number and concreteness of nouns, and trained a proxy model to analyze the feature effects on the AI-generated image. With model explanations of the proxy model, we curated a rubric to adjust text prompts to optimize image generation for precise emotion expression. We conducted simulation and user studies, which showed that RePrompt significantly improves the emotional expressiveness of AI-generated images, especially for negative emotions.
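The "intuitive text features" mentioned in the abstract, such as the number and concreteness of nouns, can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' pipeline: the `prompt_features` function, the `TOY_CONCRETENESS` lexicon, and its ratings are hypothetical, and a real system would use a part-of-speech tagger and a published concreteness norm.

```python
import re

# Hypothetical toy lexicon (invented values): noun -> concreteness rating
# on a 1 (abstract) to 5 (concrete) scale. Not taken from the paper.
TOY_CONCRETENESS = {
    "dog": 4.9,
    "beach": 4.8,
    "storm": 4.3,
    "freedom": 1.9,
    "sadness": 1.8,
}


def prompt_features(prompt):
    """Return the noun count and mean noun concreteness for a text prompt.

    A word counts as a noun only if it appears in the toy lexicon, which
    stands in for a real part-of-speech tagger.
    """
    tokens = re.findall(r"[a-z]+", prompt.lower())
    ratings = [TOY_CONCRETENESS[t] for t in tokens if t in TOY_CONCRETENESS]
    mean = sum(ratings) / len(ratings) if ratings else 0.0
    return {"num_nouns": len(ratings), "mean_concreteness": mean}


if __name__ == "__main__":
    print(prompt_features("A dog on the beach"))
    print(prompt_features("Freedom and sadness"))
```

Features of this kind could then be fed to a proxy model whose explanations indicate which prompt edits are likely to help.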
SpaText: Spatio-Textual Representation for Controllable Image Generation
Avrahami, Omri, Hayes, Thomas, Gafni, Oran, Gupta, Sonal, Taigman, Yaniv, Parikh, Devi, Lischinski, Dani, Fried, Ohad, Yin, Xi
Recent text-to-image diffusion models are able to generate convincing results of unprecedented quality. However, it is nearly impossible to control the shapes of different regions/objects or their layout in a fine-grained fashion. Previous attempts to provide such controls were hindered by their reliance on a fixed set of labels. To this end, we present SpaText - a new method for text-to-image generation using open-vocabulary scene control. In addition to a global text prompt that describes the entire scene, the user provides a segmentation map where each region of interest is annotated by a free-form natural language description. Due to lack of large-scale datasets that have a detailed textual description for each region in the image, we choose to leverage the current large-scale text-to-image datasets and base our approach on a novel CLIP-based spatio-textual representation, and show its effectiveness on two state-of-the-art diffusion models: pixel-based and latent-based. In addition, we show how to extend the classifier-free guidance method in diffusion models to the multi-conditional case and present an alternative accelerated inference algorithm. Finally, we offer several automatic evaluation metrics and use them, in addition to FID scores and a user study, to evaluate our method and show that it achieves state-of-the-art results on image generation with free-form textual scene control.
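The abstract refers to extending classifier-free guidance to the multi-conditional case. As a rough sketch of one common formulation (not necessarily the exact algorithm in the paper), the guided noise estimate starts from the unconditional prediction and adds one separately scaled correction per condition, for example one for the global text prompt and one for the spatio-textual map. The function name, the flat lists standing in for noise tensors, and the scale values are illustrative assumptions.

```python
from typing import List, Sequence


def multi_cond_guidance(
    eps_uncond: Sequence[float],
    eps_conds: Sequence[Sequence[float]],
    scales: Sequence[float],
) -> List[float]:
    """Combine noise predictions for several conditions.

    eps_uncond: prediction with every condition dropped.
    eps_conds:  one prediction per condition, each with only that
                condition supplied.
    scales:     one guidance scale per condition.

    Computes eps_uncond + sum_i scales[i] * (eps_conds[i] - eps_uncond),
    element by element.
    """
    if len(eps_conds) != len(scales):
        raise ValueError("need exactly one scale per condition")
    guided = [float(v) for v in eps_uncond]
    for eps_c, scale in zip(eps_conds, scales):
        if len(eps_c) != len(eps_uncond):
            raise ValueError("all predictions must have the same length")
        for k in range(len(guided)):
            guided[k] += scale * (eps_c[k] - eps_uncond[k])
    return guided


if __name__ == "__main__":
    # Hypothetical two-element "noise" vectors for a text and a layout condition.
    uncond = [0.0, 0.0]
    text_pred = [1.0, 0.0]
    layout_pred = [0.0, 1.0]
    print(multi_cond_guidance(uncond, [text_pred, layout_pred], [2.0, 3.0]))
```

With a single condition this reduces to ordinary classifier-free guidance. The cost is one extra model evaluation per condition at each denoising step, which is presumably why the abstract also mentions an accelerated inference alternative.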
All the new AI features for Gmail, Google Docs, and Slides
Google Workspace is getting a generative AI boost at the same time that many other productivity suites are adding features that let users simplify clerical tasks with just a prompt. Following up on the visual redesign of Google Docs and the announcement of Google Bard, these new AI features are the company's latest attempt to bring more buzzy goodness to its most popular applications. Google announced in a blog post on Tuesday its next frontier of AI support, which will span the Workspace suite: Gmail, Docs, Slides, Sheets, Meet, and Chat. The host of writing-based features will be available first to a set of English-language testers in the U.S. throughout the year and then made public at a later date. As with other AI generators, users will be able to enter a prompt within a Workspace app, such as Docs or Gmail, and the AI will do the rest. Google uses examples including "a busy HR professional who needs to create customized job descriptions, or a parent drafting the invitation for your child's pirate-themed birthday party."
New AI can "reimagine" your pictures in infinite ways
UK/California-based tech startup Stability AI has launched Stable Diffusion Reimagine, an image-to-image AI that generates brand new pictures inspired by one uploaded by a user -- and it's going to be open sourced. The background: 2022 saw the release of a number of impressive text-to-image AIs -- programs that can create images based on text prompts -- with one of the most popular examples being Stability AI's Stable Diffusion. A major reason for this popularity was that, unlike DALL-E 2 and most other text-to-image AIs, Stable Diffusion was open source -- users could access the code and make unique models, such as ones that only generated Pokémon or artwork in their personal style. Stability AI has now announced the release of a new tool called Stable Diffusion Reimagine; instead of generating new images based on text prompts, it creates ones inspired by uploaded images. Stable Diffusion already had a feature called "img2img" that allowed users to upload images along with a text prompt to guide the AI.
How AMD Is Working to Conquer Generative AI: A Difficult Task Ahead (NASDAQ:AMD) - Bytefeed - News Powered by AI
Artificial intelligence (AI) is becoming increasingly important in the technology industry, and Advanced Micro Devices (AMD) is one of the leading companies in this space. AMD has been investing heavily in AI research and development, and its efforts are beginning to pay off. The company recently announced a new generative AI platform that could revolutionize how businesses use machine learning. Generative AI is a form of artificial intelligence that produces new data, such as text or images, rather than only classifying or retrieving existing data. Models of this kind are trained on existing datasets and have the potential to enable entirely new products and services.
Ignite Friday Digital Marketing News (Updated Every Friday)
This week: TikTok challenges Google and Microsoft with search ads, GPT-4 is on the way, and social media engagement rates are dropping. Here's what happened this week in digital marketing. OpenAI hasn't been in the news enough lately so it's time for a fresh update. The next version of GPT, unimaginatively called GPT-4, will go live soon. In fact, it might already be live by the time you read this. As far as the updates that make it more worthwhile than GPT-3, it's got multimodal functionality. That means it supports text, speech, images, and even video. GPT-4 also works across multiple languages. If you've noticed that your social media engagement rates are on the decline, you're not alone.
Generative AI: Imagining a future of AI-dominated creativity
AI-generated media has reached an explosive tipping point. Even before the debut of OpenAI's ChatGPT electrified the internet, the research laboratory captured the attention of the art and design world for its generative AI system, DALL-E, allowing anyone to create images of anything their heart desires by simply entering a few words or phrases. Over the past months, more than a million users have signed up to use DALL-E beta, and the company is further expanding its reach by offering an API so that creators, developers and businesses can integrate this powerful technology and further explore its creative potential. Meanwhile, AI-generated work continues to disrupt other corners of the cultural landscape, from the six-figure sale of the generative portrait at Christie's in 2018 to this year's controversial awarding of a top prize to an AI artwork in a contest for emerging artists. The arrival of AI creations in the highest echelons of the art world and the proliferation of user-friendly AI software like DALL-E 2, Midjourney and Lensa have renewed debate over creative production and ownership, and prompted attempts to provide practical answers to questions previously relegated to the realm of theory: What differentiates a machine-made painting from a work of art?
Deep Learning Eliminated Creativity in AI - Zbigatron
It was a huge breakthrough (circa 2012) that allowed AI to blast into the headlines and into our lives like never before. ChatGPT, DALL-E 2, autonomous cars, etc. -- deep learning is the engine driving these stories. DL is so good that it has reached a point where every solution to a problem involving AI is now most probably being solved using it. Just take a look at any academic conference/workshop and scan through the presented publications. Now, DL is great, don't get me wrong.
Generative UI Design: Einstein, Galileo, and the AI Design Process
OpenAI, Stable Diffusion and the like have enabled a range of products to bring AI-assisted copywriting, image creation, and even coding to our fingertips. It was just a matter of time until generative AI made its way over to User Interface design. What if we could generate User Interfaces automatically? To take it even a step further, what if we could predict UIs? Only recently have these AI-driven interface design tools become commercially available. There's Galileo, Genius, Magician, and probably many more to come. This current wave of 'generative AI' might seem pretty new for UI design, but work in this area has been ongoing for years, with notable progress back in 2020.
ChatGPT 4 & Generative AI Tips & News; 'We are a little bit scared': OpenAI CEO; Apple is experimenting with language-generating AI; Spring budget 23.
The 2nd annual Semantic Layer Summit is here! Registration is now open for this FREE one-day virtual event feat. As generative AI models grow larger and more powerful, some scientists advocate for leaner, more energy-efficient systems. Don't forget to sign up for NVIDIA GTC on the 21st of March to learn about the latest breakthroughs in Generative AI and Tech. Join here for free for a chance to win a GeForce 4080 or one of the 10 NVIDIA DLI vouchers.