Generative AI
Text to Image Generation: Leaving no Language Behind
Reviriego, Pedro, Merino-Gómez, Elena
One of the latest applications of Artificial Intelligence (AI) is to generate images from natural language descriptions. These generators are now becoming available and achieve impressive results that have been used, for example, on the front covers of magazines. As the input to the generators is in the form of a natural language text, a question that arises immediately is how these models behave when the input is written in different languages. In this paper we perform an initial exploration of how the performance of three popular text-to-image generators depends on the language. The results show that there is a significant performance degradation when using languages other than English, especially for languages that are not widely used. This observation leads us to discuss different alternatives on how text-to-image generators can be improved so that performance is consistent across different languages. This is fundamental to ensure that this new technology can be used by non-native English speakers and to preserve linguistic diversity.
New DALL-E integration adds generative AI for next-level slides
Check out the on-demand sessions from the Low-Code/No-Code Summit to learn how to successfully innovate and achieve efficiency by upskilling and scaling citizen developers. For Tome, which calls itself the "new storytelling format for work and important ideas," integrating OpenAI's DALL-E into its flexible, interactive slide options -- which it announced today -- was a natural fit to add a generative AI dimension to decks. When OpenAI announced the release of the DALL-E API in early November, the San Francisco-based startup had its chance. "Making that a part of the storytelling creation experience just felt really natural," Tome CEO Keith Peiris told VentureBeat. "It felt so much more powerful than looking for a stock photo or clip art -- it's kind of giving us a first look at what generative storytelling can look like."
Why Salesforce is betting on generative AI for conversational workflows
Salesforce's AI research is heavily focused on generative AI techniques to provide a fully conversational workflow, according to Silvio Savarese, EVP and chief scientist at Salesforce. In a world with increasing workloads -- where even highly trained experts are expected to do more with less -- as well as constant information overload and the need to master complex tools, harnessing the power of simple conversation is incredibly useful, he says. In a recent Salesforce Research blog post, Savarese called conversation "a kind of universal interface for human collaboration." That's why Salesforce developed its open-source large-scale language model, CodeGen, which is competitive with OpenAI's Codex (which, in turn, powers GitHub Copilot) and turns simple English prompts into executable code.
The AI Empathy Crisis
AI language models (LMs) have recently gotten so skilled as to be believable--even deceptive. Not in the sense of intentionally fooling people, but in the sense of being capable of generating utterances that would make us imagine a mind behind the screen. We--gullible humans with a tendency to anthropomorphize non-living objects--are the perfect victims of this trap. As access to LMs becomes widespread, many will start doubting. Some will even claim certainty: "AI is alive and sentient."
"Architects can rest easy that AI isn't coming for their jobs just yet"
Despite the justified controversy surrounding AI art, architects need not worry about being usurped by software that can generate images of buildings, argues Will Wiles. These are uncertain times, but we can be sure of two things. The first is that art made by artificial intelligence (AI) is here to stay. (Please feel free to imagine those marks if you prefer.) The second is that AI art will remain controversial, and rightly so. Human artists fear, quite reasonably, that it will consume much of the bread-and-butter work on which they depend.
GPT-4 Rumors From Silicon Valley
GPT-4 is possibly the most anticipated AI model in history. In 2020, GPT-3 surprised everyone with a huge performance leap from GPT-2 and set unprecedented expectations for its successor. But for two years OpenAI has been super shy about GPT-4--letting out info in dribs and drabs and remaining silent for the most part. People have been talking in recent months. What I've heard from several sources: GPT-4 is almost ready and will be released (hopefully) sometime between December and February.
Why the collapse of Sam Bankman-Fried's FTX has split A.I. researchers
First, we need to clear up terminology, like A.I. Safety, which sounds like a completely neutral, uncontroversial term. Who wouldn't want safe A.I. software? And you might think that the definition of A.I. "safety" would include A.I. that isn't racist or sexist and isn't used to abet genocide. All of which, by the way, are actual, documented concerns about today's existing A.I. software. Yet actually, none of those concerns are what A.I. researchers generally mean when they talk about "A.I. Instead, those things fall into the camp of "A.I.
How DeviantArt is navigating the AI art minefield
But the real problem is that, especially without extensive AI expertise, smaller platforms can only do so much. The agenda so far is being set by fast-moving AI startups like OpenAI and Stability, as well as tech giants like Google. Beyond simply banning AI-generated work, there's no easy way to navigate the system without touching what's become a third rail to many artists. "This is not something that DeviantArt can fix on our own," admits Gurwicz. "Until there's proper regulation in place, it does require these AI models and platforms to go beyond just what is legally required and think about, ethically, what's right and what's fair."
How Descript's generative AI makes video editing as easy as updating text
A podcaster steps up to a mic to do a review of a new chicken nugget brand. As he begins talking and recording himself on his laptop, real-time speech-to-text transcribes his comments: "So these nuggets are, um, made from chicken, but they're made to um, um, um, um, emulate the taste of, like, like, non chicken nuggets." That doesn't sound very professional; on his screen, he strikes through those filler words -- and while he's at it, boosts the podcast's sound quality before publishing it for his audience. This is one use case for audio-video editing tool Descript, which today announced a significant product update and a $50 million series C round led by the OpenAI Startup Fund. "The whole concept of Descript -- editing video like a doc -- is only possible because of AI [artificial intelligence]," said Jay LeBoeuf, Descript's head of business and corporate development.
The scary truth about AI copyright is nobody knows what will happen next
Regardless of where we land on these legal questions, the various actors in the generative AI field are already gearing up for… something. The companies making millions from this tech are entrenching themselves: repeatedly declaring that everything they're doing is legal (while presumably hoping no one actually challenges this claim). Getty Images recently banned AI content because of the potential legal risk to customers ("I don't think it's responsible.