MarioNette: Self-Supervised Sprite Learning
Artists and video game designers often construct 2D animations using libraries of sprites---textured patches of objects and characters. We propose a deep learning approach that decomposes sprite-based video animations into a disentangled representation of recurring graphic elements in a self-supervised manner. By jointly learning a dictionary of possibly transparent patches and training a network that places them onto a canvas, we deconstruct sprite-based content into a sparse, consistent, and explicit representation that can be easily used in downstream tasks, like editing or analysis. Our framework offers a promising approach for discovering recurring visual patterns in image collections without supervision.
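The decomposition the abstract describes pairs a learned dictionary of RGBA patches with a network that predicts where to place them. The rendering step itself is ordinary alpha compositing, which can be sketched as follows. This is an illustrative toy, not the paper's implementation; the function name `composite_sprites` and the data layout are assumptions for the example.

```python
import numpy as np

def composite_sprites(canvas, sprites, placements):
    """Alpha-composite RGBA sprite patches onto an RGB canvas.

    canvas:     (H, W, 3) float array in [0, 1]
    sprites:    dict mapping sprite id -> (h, w, 4) RGBA patch in [0, 1]
    placements: list of (sprite_id, top, left) tuples
    """
    out = canvas.copy()
    for sid, top, left in placements:
        patch = sprites[sid]
        h, w = patch.shape[:2]
        rgb, alpha = patch[..., :3], patch[..., 3:4]
        region = out[top:top + h, left:left + w]
        # standard "over" compositing: sprite in front, canvas behind
        out[top:top + h, left:left + w] = alpha * rgb + (1 - alpha) * region
    return out

# Hypothetical example: one opaque red 2x2 sprite on a black 4x4 canvas.
canvas = np.zeros((4, 4, 3))
sprite = np.zeros((2, 2, 4))
sprite[..., 0] = 1.0  # red channel
sprite[..., 3] = 1.0  # fully opaque alpha
frame = composite_sprites(canvas, {"hero": sprite}, [("hero", 1, 1)])
print(frame[1, 1])  # -> [1. 0. 0.]
```

In the paper's setting the dictionary entries and the placement decisions are both learned, and gradients flow through a differentiable version of this compositing step; the sketch above only shows the forward rendering.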
Wearable Haptics for a Marionette-inspired Teleoperation of Highly Redundant Robotic Systems
Torielli, Davide, Franco, Leonardo, Pozzi, Maria, Muratore, Luca, Malvezzi, Monica, Tsagarakis, Nikos, Prattichizzo, Domenico
The teleoperation of complex, kinematically redundant robots with loco-manipulation capabilities represents a challenge for human operators, who have to learn how to operate the many degrees of freedom of the robot to accomplish a desired task. In this context, developing an easy-to-learn and easy-to-use human-robot interface is paramount. Recent works introduced a novel teleoperation concept that relies on a virtual physical interaction interface between the human operator and the remote robot, equivalent to a "Marionette" control, but which provided only visual feedback on the human side. In this paper, we propose extending the "Marionette" interface with a wearable haptic interface to address the limitations of previous works. Leveraging the additional haptic feedback modality, the human operator gains full sensorimotor control over the robot, and awareness of the robot's response and interactions with the environment is greatly improved. We evaluated the proposed interface and the related teleoperation framework with naive users, assessing teleoperation performance and user experience with and without haptic feedback. The experiments consisted of a loco-manipulation mission with the CENTAURO robot, a hybrid leg-wheel quadruped with a humanoid dual-arm upper body.
A conceptual advance that gives microrobots legs
In 1959, Nobel laureate and nanotechnology visionary Richard Feynman suggested that it would be interesting to "swallow the surgeon" -- that is, to make a tiny robot that could travel through blood vessels to carry out surgery where needed. This iconic imagining of the future underscored modern hopes for the field of micrometre-scale robotics: to deploy autonomous devices in environments that their macroscopic counterparts cannot reach. However, the construction of such robots presents several challenges, including the obvious difficulty of how to assemble a microscopic locomotive device. In a paper in Nature, Miskin et al. [1] report electrochemically driven devices that propel laser-controlled microrobots through a liquid, and which could be easily integrated with microelectronics components to construct fully autonomous microrobots. Designing propulsion strategies for microrobots that move through liquid environments is challenging because strong drag forces prevent microscale objects from maintaining momentum [2].
SenseTime's AI generates realistic deepfake videos
In late 2019, researchers at Seoul-based Hyperconnect developed a tool (MarioNETte) that could manipulate the facial features of a historical figure, a politician, or a CEO using nothing but a webcam and still images. More recently, a team hailing from Hong Kong-based tech giant SenseTime, Nanyang Technological University, and the Chinese Academy of Sciences' Institute of Automation proposed a method of editing target portrait footage by taking sequences of audio to synthesize photo-realistic videos. As opposed to MarioNETte, SenseTime's technique is dynamic, meaning it's better able to handle media it hasn't encountered before. And the results are impressive, albeit worrisome in light of recent developments involving deepfakes. The coauthors of the study describing the work note that the task of "many-to-many" audio-to-video translation -- that is, translation that doesn't assume a single identity for the source and target video -- is challenging.
Researchers train AI to map a person's facial movements to any target headshot
What if you could manipulate the facial features of a historical figure, a politician, or a CEO realistically and convincingly using nothing but a webcam and an illustrated or photographic still image? A tool called MarioNETte that was recently developed by researchers at Seoul-based Hyperconnect accomplishes this, thanks in part to cutting-edge machine learning techniques. The researchers claim it outperforms all baselines even where there's "significant" mismatch between the face to be manipulated and the person doing the manipulating. MarioNETte is technically a face reenactment tool, in that it aims to synthesize a reenacted face animated by the movement of a person (a "driver") while preserving the target face's appearance. It's not a new idea, but previous approaches either (1) required a few minutes of training data and could only reenact predefined targets, or (2) would distort the target's features when dealing with large poses.
MarioNETte: Few-Shot Identity Preservation in Facial Reenactment
If you've ever wanted to see Einstein play charades, Rodin's "The Thinker" wink at you, or an ancient Chinese Emperor cast in a Chaplin movie -- then the AI-powered video transformation tech you're looking for is "face reenactment," which can digitally deliver all such fantastic scenarios. Unlike face swapping, which transfers a face from one source to another, face reenactment captures the movements of a driver face and expresses them through the identity of a target face. Starting with a dynamic driver face, researchers can manipulate any target face -- from today's celebrities to historical figures, including any age, ethnicity or gender -- to perform any humanly possible face-based task. Previous approaches to synthesizing a reenacted face used generative adversarial networks (GANs), which have demonstrated tremendous ability in a wide range of image generation tasks. GAN-based models, however, require at least a few minutes of training data for each target.