transframer
DeepMind: Transframer AI dreams 30-second video from an image
DeepMind's new video AI, Transframer, can handle a whole range of image and video tasks and can dream up 30-second videos from a single frame. Generative AI systems have moved from research labs to industrial and consumer applications in recent years, a shift kicked off by OpenAI's large-scale language model GPT-3. In April 2022, the company introduced the DALL-E 2 image generation system, which indirectly spawned alternatives such as Midjourney and Stable Diffusion. Google's sister company DeepMind is now showing Transframer, an AI model that could offer a glimpse of the next generation of generative AI models. Transframer is a visual prediction framework that can solve eight image modeling and processing tasks at once, such as depth estimation, instance segmentation, object recognition or video prediction.
Google's DeepMind AI can 'transframe' a single image into a video
Earlier this week, the team at Google's DeepMind AI lab unveiled a new model dubbed Transframer, which can generate 30-second videos from a single image input. It's a nifty little trick at first glance, but the implications are much larger than an interesting .GIF file. DeepMind announced the work on Twitter: "Transframer is a general-purpose generative framework that can handle many image and video tasks in a probabilistic setting. New work shows it excels in video prediction and view synthesis, and can generate 30s videos from a single image: https://t.co/wX3nrrYEEa" The DeepMind research team explains: "Transframer is state-of-the-art on a variety of video generation benchmarks, and… can generate coherent 30 second videos from a single image without any explicit geometric information."
Google Scientists Create AI That Can Generate Videos From One Frame
Google's DeepMind has demonstrated an AI model that can dream up short videos from a single image frame, and it's really cool to see how it works. As DeepMind noted on Twitter, the model, named "Transframer" (a riff on "transformer," a neural network architecture commonly used in AI models that generate text from partial prompts), "excels in video prediction and view synthesis," and is able to "generate 30 [second] videos from a single image." As the Transframer website notes, the AI makes its perspective videos by predicting the target images' surroundings with "context images." In short, it guesses what an object such as a chair would look like from different perspectives, drawing on extensive training data that lets it "imagine" an actual object from another angle.