mermaid
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation
Wang, Zun, Li, Jialu, Lin, Han, Yoon, Jaehong, Bansal, Mohit
Storytelling video generation (SVG) has recently emerged as a task to create long, multi-motion, multi-scene videos that consistently represent the story described in the input text script. SVG holds great potential for diverse content creation in media and entertainment; however, it also presents significant challenges: (1) objects must exhibit a range of fine-grained, complex motions, (2) multiple objects need to appear consistently across scenes, and (3) subjects may require multiple motions with seamless transitions within a single scene. To address these challenges, we propose DreamRunner, a novel story-to-video generation method: First, we structure the input script using a large language model (LLM) to facilitate both coarse-grained scene planning as well as fine-grained object-level layout and motion planning. Next, DreamRunner presents retrieval-augmented test-time adaptation to capture target motion priors for objects in each scene, supporting diverse motion customization based on retrieved videos, thus facilitating the generation of new videos with complex, scripted motions. Lastly, we propose a novel spatial-temporal region-based 3D attention and prior injection module (SR3AI) for fine-grained object-motion binding and frame-by-frame semantic control. We compare DreamRunner with various SVG baselines, demonstrating state-of-the-art performance in character consistency, text alignment, and smooth transitions. Additionally, DreamRunner exhibits strong fine-grained condition-following ability in compositional text-to-video generation, significantly outperforming baselines on T2V-ComBench. Finally, we validate DreamRunner's robust ability to generate multi-object interactions with qualitative examples.
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts
Singh, Shubhankar, Chaurasia, Purvi, Varun, Yerram, Pandya, Pranshu, Gupta, Vatsal, Gupta, Vivek, Roth, Dan
Existing benchmarks for visual question answering lack in visual grounding and complexity, particularly in evaluating spatial reasoning skills. We introduce FlowVQA, a novel benchmark aimed at assessing the capabilities of visual question-answering multimodal language models in reasoning with flowcharts as visual contexts. FlowVQA comprises 2,272 carefully generated and human-verified flowchart images from three distinct content sources, along with 22,413 diverse question-answer pairs, to test a spectrum of reasoning tasks, including information localization, decision-making, and logical progression. We conduct a thorough baseline evaluation on a suite of both open-source and proprietary multimodal language models using various strategies, followed by an analysis of directional bias. The results underscore the benchmark's potential as a vital tool for advancing the field of multimodal modeling, providing a focused and challenging environment for enhancing model performance in visual and logical reasoning tasks.
- North America > United States > California > Santa Clara County > San Jose (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Workflow (1.00)
- Research Report > New Finding (0.87)
Here's the cream of the crop from the Day of the Devs Game Awards stream
Day of the Devs is awesome. It's a showcase that pops up a few times a year to promote promising, in-progress indie games, irrespective of publisher, genre, budget, visual style or release window. It's curated by the folks at Double Fine and iam8bit, and they've been hosting Day of the Devs live events and digital showcases for the past 11 years. The latest Day of the Devs celebration wrapped up on December 6, the day before The Game Awards, and it featured 20 marvelous and strange independent projects. The virtual show included a few world premieres and release date announcements, but mostly, it was a celebration of creativity and innovation in indie games.
- North America > United States > Alaska (0.05)
- Asia > South Korea (0.05)
'Racist' AI scientist blasted for 'fixing' black Ariel in 'The Little Mermaid'
Twitter decided two users could not be part of their world after an artificial intelligence scientist "whitewashed" actress Halle Bailey in "The Little Mermaid" trailer. In addition to suspending their respective accounts, the AI guy has been blasted by other users on the site for digitally replacing Bailey -- who is black -- with a fake white actress. The viral tweet circulated days after the new Disney film's trailer reportedly received over 1.5 million dislikes on YouTube from "racist" viewers who are upset that the previously white-skinned, red-headed Ariel is now a black woman. Twitter user @TenGazillioinIQ took that to the next level and "fixed" the clip by using AI to make the live-action fish woman white. "Credits to our member Artificial Intelligence scientist @TenGazillioinIQ," the tweet -- made by another user, @vandalibm, read, according to screenshots taken by DailyMail before the account was suspended.
Here's what Disney Princesses would look like in real life according to AI
Ever wondered what Disney Princesses would look like in real life according to artificial intelligence (AI)? Well, wonder no more, as a recent TikTok video has gone viral for using AI on some of the most popular animated Disney Princesses to imagine their live-action counterparts. The TikTok video, which you can see below, was uploaded by Tony Aubé, a Silicon Valley designer who previously worked at Google AI. The 20-second clip shows images of Frozen's Elsa, Aladdin's Jasmine, The Little Mermaid's Ariel, and the titular Moana against their respective digitally reimagined AI designs, which Aubé was able to make with the help of video reenactment technology. Though not a Disney character, the video also featured an AI reimagining of Princess Fiona from the DreamWorks franchise, Shrek.
Lego Leverages Disney, Robots And Healthcare In Its Best Sets Of 2017
At New York Toy Fair last week, Lego unveiled most of its sets for Spring, Summer and Fall of 2017; here are the best ones. Whether you are a Lego collector or a family builder, the selection is somewhat overwhelming. What was clear, though, was that Disney and movie licences were driving the line. Whether it was sets from Frozen, Cinderella, The Little Mermaid, Star Wars or Moana, the attention to film-inspired detail was impressive. Along with these were a number of new Minecraft sets and new kits for Ninjago and The Lego Batman Movie. Then there were Lego's own brands: Elves, Friends, City and Nexo Knights.
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.26)
- North America > United States > New York (0.25)
- Information Technology > Artificial Intelligence > Robots (0.53)
- Information Technology > Artificial Intelligence > Games (0.35)
Warcraft: The Beginning proves a monster hit in China
Warcraft: The Beginning, the adaptation of the video game World of Warcraft, has proved a massive hit in China, with a first-day take of $46m (£31.8m), the second biggest in the country's history after another Hollywood hit, Furious 7, which took $63.1m in its first 24 hours in 2015. Warcraft's impressive results put it on course to challenge Furious 7's $150m opening instalment in China – and thoroughly dwarf the projected result for its domestic release in the US, which is currently tracking for around $25m when it opens on Friday. Warcraft's success in China has been ascribed partly to the game's popularity there – according to the International Business Times (IBT), an estimated half of the world's players are based in the country – and partly to the favourable position the film acquired in the release calendar due to the participation of a number of powerful Chinese enterprises in Warcraft: notably the purchase of production outfit Legendary Pictures by conglomerate Dalian Wanda. However, box-office analysts don't expect Warcraft to trouble any Hollywood-import grossing records, such as Furious 7's total of $390m, let alone domestic record holder The Mermaid, which finished with $526m earlier this year.
- North America > United States (0.27)
- Asia > China > Liaoning Province > Dalian (0.27)
- Media > Film (0.59)
- Leisure & Entertainment > Games > Computer Games (0.59)
Does This Terrifying Robot Really Have to Look Like a Mermaid?
Mermaids, and their less famous comrades the mermen, are beautiful beings that have mastered the underwater world. They also have a more sinister rep as vicious bastards that drag sailors to their watery doom. So perhaps it's no wonder that a new mermaid-like robot from Stanford, the OceanOne--with its graceful, streamlined body but oh, also, claws and dead eyes--elicits mixed emotions. Sure, it looks like you and me, but it's just rather more, well, electronic. Grab a pitchfork and vow to hunt it down. But OceanOne is in fact an emblem of a battle over the future of robotics: Humanoid bots are getting roboticists riled up, and not just because they're creepy.
Diving Robot 'Mermaid' Lends a Hand (or 2) to Ocean Exploration
In Mediterranean waters, off the coast of France, a diver recently visited the shipwreck La Lune -- a vessel in King Louis XIV's fleet -- which lay untouched and unexplored on the ocean bottom since it sank in 1664. But the wreck's first nonaquatic visitor in centuries wasn't human -- it was a robot. Dubbed "OceanOne," the bright orange diving robot resembles a mecha-mermaid. It measures about 5 feet (1.5 meters) in length and has a partly human form: a torso, a head -- with stereoscopic vision -- and articulated arms. Its lower section holds its computer "brain," a power supply, and an array of eight multidirectional thrusters.
- Europe > France (0.25)
- North America > United States > California (0.05)
- Indian Ocean > Red Sea (0.05)