Deepmind: Transframer AI dreams 30-second video from an image

#artificialintelligence

Deepmind's new video AI, Transframer, can handle a whole range of image and video tasks, and can dream up 30-second videos from a single frame. Generative AI systems have moved from research labs into industrial and consumer applications in recent years, a shift kicked off by OpenAI's large-scale language model GPT-3. Last April, the company introduced the DALL-E 2 image generation system, which indirectly spawned alternatives such as Midjourney and Stable Diffusion. Google sister company Deepmind is now showing Transframer, an AI model that could offer a glimpse of the next generation of generative AI models. Deepmind's Transframer is a visual prediction framework that can solve eight image modeling and processing tasks at once, such as depth estimation, instance segmentation, object recognition, and video prediction.


Google's DeepMind AI can 'transframe' a single image into a video

#artificialintelligence

Earlier this week, the team at Google's DeepMind AI lab unveiled a new model dubbed Transframer, which can generate 30-second videos from a single image input. It's a nifty little trick at first glance, but the implications are much larger than an interesting GIF. As the lab's announcement puts it: "Transframer is a general-purpose generative framework that can handle many image and video tasks in a probabilistic setting. New work shows it excels in video prediction and view synthesis, and can generate 30s videos from a single image: https://t.co/wX3nrrYEEa" "Transframer is state-of-the-art on a variety of video generation benchmarks, and… can generate coherent 30 second videos from a single image without any explicit geometric information," the DeepMind research team explains.


Google Scientists Create AI That Can Generate Videos From One Frame

#artificialintelligence

Google's DeepMind AI lab has demonstrated a model that can dream up short videos from a single image frame, and it's really cool to see how it works. As DeepMind noted on Twitter, the artificial intelligence model, named "Transframer" -- a riff on the "transformer," a common type of AI architecture that generates text from partial prompts -- "excels in video prediction and view synthesis," and is able to "generate 30 [second] videos from a single image." The lab's announcement reads: "Transframer is a general-purpose generative framework that can handle many image and video tasks in a probabilistic setting. New work shows it excels in video prediction and view synthesis, and can generate 30s videos from a single image: https://t.co/wX3nrrYEEa" As the Transframer website notes, the AI produces its perspective videos by predicting the target images' surroundings from "context images" -- in short, by correctly guessing what an object such as a chair would look like from different perspectives, based on extensive training data that lets it "imagine" the object from another angle.
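The "context images" idea can be conveyed with a deliberately tiny toy sketch. This is not DeepMind's code: Transframer is a learned probabilistic transformer, while the `predict_view` function, its angle annotations, and the flat pixel lists below are all invented for illustration. The sketch only shows the general shape of the task: given a few annotated views of a scene, produce a plausible frame for an unseen viewpoint.

```python
# Toy view synthesis from annotated "context images" (hypothetical sketch,
# not Transframer's actual method): each context view is a camera angle
# paired with a frame, here a flat list of pixel intensities. The target
# view is produced by blending the two nearest context views.

def predict_view(contexts, target_angle):
    """contexts: list of (angle_in_degrees, frame) pairs.
    Returns an interpolated frame for target_angle."""
    contexts = sorted(contexts, key=lambda c: c[0])
    # Outside the covered range, fall back to the nearest context view.
    if target_angle <= contexts[0][0]:
        return list(contexts[0][1])
    if target_angle >= contexts[-1][0]:
        return list(contexts[-1][1])
    # Find the bracketing pair of views and blend pixel-wise.
    for (a0, f0), (a1, f1) in zip(contexts, contexts[1:]):
        if a0 <= target_angle <= a1:
            w = (target_angle - a0) / (a1 - a0)
            return [(1 - w) * p0 + w * p1 for p0, p1 in zip(f0, f1)]

# Two context "images" of the same scene, seen from 0 and 90 degrees:
views = [(0.0, [0.0, 1.0]), (90.0, [1.0, 0.0])]
print(predict_view(views, 45.0))  # halfway blend: [0.5, 0.5]
```

Where this linear blend interpolates mechanically, Transframer instead learns from training data what the in-between (or far-extrapolated) views should look like, which is why it can keep a 30-second video coherent without explicit geometric information.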