

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Girdhar, Rohit, Singh, Mannat, Brown, Andrew, Duval, Quentin, Azadi, Samaneh, Rambhatla, Sai Saketh, Shah, Akbar, Yin, Xi, Parikh, Devi, Misra, Ishan

arXiv.org Artificial Intelligence

We present Emu Video, a text-to-video generation model that factorizes the generation into two steps: first generating an image conditioned on the text, and then generating a video conditioned on the text and the generated image. We identify critical design decisions--adjusted noise schedules for diffusion, and multi-stage training--that enable us to directly generate high quality and high resolution videos, without requiring a deep cascade of models as in prior work. In human evaluations, our generated videos are strongly preferred in quality compared to all prior work--81% vs. Google's Imagen Video, 90% vs. Nvidia's PYOCO, and 96% vs. Meta's Make-A-Video. Our model outperforms commercial solutions such as RunwayML's Gen2 and Pika Labs. Finally, our factorizing approach naturally lends itself to animating images based on a user's text prompt, where our generations are preferred 96% over prior work.
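The factorized two-step generation described in the abstract can be sketched in Python. The function names, tensor shapes, and stand-in bodies below are illustrative assumptions, not the actual Emu Video implementation — in a real system each step would invoke a diffusion model:

```python
import numpy as np

def generate_image(prompt: str, height: int = 512, width: int = 512) -> np.ndarray:
    """Step 1 (stand-in): sample an image conditioned on the text prompt.
    A real system would run a text-to-image diffusion model here."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.random((height, width, 3))

def generate_video(prompt: str, first_frame: np.ndarray, num_frames: int = 16) -> np.ndarray:
    """Step 2 (stand-in): sample a video conditioned on both the text and the
    generated image. Here we simply tile the frame to show the data flow."""
    return np.stack([first_frame] * num_frames, axis=0)

def factorized_text_to_video(prompt: str) -> np.ndarray:
    # Factorized generation: text -> image, then (text, image) -> video.
    image = generate_image(prompt)
    return generate_video(prompt, image)

video = factorized_text_to_video("a teddy bear washing dishes")
print(video.shape)  # (16, 512, 512, 3)
```

The point of the factorization is that the second step is strongly conditioned on a concrete first frame, which (per the abstract) lets the model produce high-resolution video directly rather than through a deep cascade of upsampling models.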


ChatGPT-4 is coming this week and will be able to turn text into VIDEO

Daily Mail - Science & tech

ChatGPT, the revolutionary chatbot powered by artificial intelligence (AI), will soon be able to do much more than send human-like text messages. A Microsoft executive has revealed that the next version - set to be released this week - will be able to turn text prompts into unique videos. The tech giant has invested heavily in ChatGPT, and has already unveiled a host of new products which incorporate it as an AI assistant, like the search engine Bing. But this updated version, dubbed GPT-4 and tipped to launch on Thursday, will have 'multimodal models', according to Microsoft Germany CTO Andreas Braun. This means that it will be able to generate content in multiple formats, like audio clips, images and video clips, from a text prompt.


The 5 most important recent developments in AI

#artificialintelligence

From solving maths and science problems to translating with astonishing accuracy between hundreds of languages - not to mention generating images and videos based on a natural language prompt - AI is making strides pretty much across the board. In this article, I'll briefly discuss some of the most recent (and most exciting!) developments. So, without further ado, let's dive in! Released on 1 August 2022, Minerva is a language model capable of not only solving maths and science problems submitted in the form of natural language, but also of providing its reasoning behind the answer. So far, Google has built three versions of the model, each bigger than the last.


How AI Transformed the Art World in 2022

#artificialintelligence

The AI community has a new obsession. It's called 'generative artificial intelligence', and it refers to the idea of having computers take over creative tasks such as writing, filmmaking, and graphic design. AI art generators are paving a new path towards the freedom of artistic expression. In an extremely short period, they've allowed everybody with internet access and a keyboard to generate incredible art from simple text prompts. Considering the current state of things, it's too early to tell whether this new wave of apps will end up costing artists and illustrators their jobs. What seems clear, though, is that these tools are already being used in creative industries.


Meta Announces Video Generation AI Model Make-a-Video

#artificialintelligence

Meta AI recently announced Make-A-Video, a text-to-video generation AI model. Make-A-Video is trained using publicly available image-text pairs and video-only data and achieves state-of-the-art performance on the UCF-101 video-generation benchmark. The model and a set of experiments were described in a paper published on arXiv. Unlike some other text-to-video (T2V) models, Make-a-Video does not require a dataset of text-video pairs. Instead, it is based on existing text-image pair models, which generate single-frame images from a text description.


Make-A-Video: Text-to-Video Generation's Next... Generation? - NAB Amplify

#artificialintelligence

The inevitable has happened, albeit a little sooner than expected. After all the hoopla surrounding text-to-image AI generators in recent months, Meta is first out of the gate with a text-to-video version. Perhaps Meta wanted to establish some headline leadership in this space, since the results aren't ready for primetime. But as developments in text-to-image generation have shown, by the time you read this the technology will already have advanced. Meta is only giving the public a glimpse of the tech it calls Make-A-Video.


📝 📺 Edge#234: Inside Meta AI's Make-A-Video

#artificialintelligence

On Thursdays, we dive deep into one of the freshest research papers or technology frameworks that is worth your attention. Our goal is to keep you up to date with new developments in AI to complement the concepts we debate in other editions of our newsletter. Text-to-Video (T2V) is considered the next frontier for generative artificial intelligence (AI) models. While the text-to-image (T2I) space is experiencing a revolution with models like DALL-E, Stable Diffusion, and Midjourney, T2V remains a monumental challenge. Recently, researchers from Meta AI unveiled Make-A-Video, a T2V model able to create realistic short video clips from textual inputs.


La veille de la cybersécurité

#artificialintelligence

Not to be outdone by Meta's Make-A-Video, Google today detailed its work on Imagen Video, an AI system that can generate video clips given a text prompt (e.g., "a teddy bear washing dishes"). While the results aren't perfect -- the looping clips the system generates tend to have artifacts and noise -- Google claims that Imagen Video is a step toward a system with a "high degree of controllability" and world knowledge, including the ability to generate footage in a range of artistic styles. As my colleague Devin Coldewey noted in his piece about Make-A-Video, text-to-video systems aren't new. Earlier this year, a group of researchers from Tsinghua University and the Beijing Academy of Artificial Intelligence released CogVideo, which can translate text into reasonably high-fidelity short clips. But Imagen Video appears to be a significant leap over the previous state-of-the-art, showing an aptitude for animating captions that existing systems would have trouble understanding. "It's definitely an improvement," Matthew Guzdial, an assistant professor at the University of Alberta studying AI and machine learning, told TechCrunch via email.


Meta enters the AI arms race with a creepy DALL-E 2 for video

#artificialintelligence

AI image generation has been let loose and it seems there's no going back. With DALL-E 2 now open to all, another player has entered the fray, not wanting to lose out - and it's none other than Facebook's parent company Meta. And while DALL-E 2 currently works its magic only with static images, Meta has revealed that it's working on a similar tool for video. As with AI image generators such as DALL-E 2, users will be able to type in a descriptive text prompt, and the tool will generate four output options. The tool, named Make-A-Video (give them a break, they were too busy with the tech to work on names), isn't yet public, but Meta AI has been taking requests on Twitter. The results are as creepy as they are astonishing.

