text-to-video ai
Why text-to-video may be the next 'big' AI thing - Times of India
Generative artificial intelligence (AI) is expanding beyond text-to-image models with the emergence of text-to-video. Runway's Gen-2 model and Google's Imagen Video and Phenaki models create videos from text prompts, but the challenge lies in achieving precision and avoiding fake or misleading videos. Ethical challenges also arise, because AI-generated videos could be used to deceive through deepfakes. However, with Big Tech already involved in developing text-to-video models, it may not be long before the technology becomes mainstream. When it comes to generative AI, only one thing is dominating the headlines: ChatGPT.
AI already turns text prompts into stunning art. Next up: video
Runway has shouldered aside Midjourney and Stable Diffusion, introducing the first clips of text-to-video AI art that the company says are generated entirely from a text prompt. The company said that it's offering a waitlist to join what it calls "Gen 2" of text-to-video AI, after offering a similar waitlist for its first, simpler text-to-video tools that use a real-world scene as a model. When AI art emerged last year, it used a text-to-image model. A user would input a text prompt describing the scene, and the tool would attempt to create an image using what it knew of real-world "seeds," artistic styles and so forth. Services like Midjourney perform these tasks on a cloud server, while Stable Diffusion and Stable Horde take advantage of similar AI models running on home PCs.
The Download: text-to-video AI, and China's big methanol bet
What's happened: Meta has unveiled an AI system that generates short videos based on text prompts. Make-A-Video lets you type in a string of words, like "A dog wearing a superhero outfit with a red cape flying through the sky," and then generates a five-second clip that, while pretty accurate, has the aesthetics of a trippy old home video. How it works: Meta combined data from three open-source image and video data sets to train its model. Standard text-image data sets of labeled still images helped the AI learn what objects are called and what they look like. And a database of videos helped it learn how those objects are supposed to move in the world. Why it matters: Although the effect is rather crude, the system offers an early glimpse of what's coming next for generative artificial intelligence, and it is the next obvious step from the text-to-image AI systems that have caused huge excitement this year.