ZS-MSTM: Zero-Shot Style Transfer for Gesture Animation driven by Text and Speech using Adversarial Disentanglement of Multimodal Style Encoding
Fares, Mireille, Pelachaud, Catherine, Obin, Nicolas
arXiv.org Artificial Intelligence
Embodied Conversational Agents (ECAs) are virtual agents with a human-like appearance that can autonomously communicate with people in a socially intelligent manner using multimodal behaviors (Lugrin [2021]). Research on ECAs has emerged as a new interface between humans and machines. ECAs' behaviors are often modeled on human communicative behaviors: they are endowed with the capacity to recognize and generate verbal and non-verbal cues (Lugrin [2021]), and are envisioned to support humans in their daily lives. Our work revolves around modeling multimodal data and learning the complex correlations between the different modalities employed in human communication. More specifically, the objective is to model ECAs' multimodal behavior together with their behavior style. Human behavior style is a socially meaningful clustering of features found within and across multiple modalities: in linguistic behavior (Campbell-Kibler et al. [2006]), in spoken behavior such as the speaking style conveyed by speech prosody (Moon et al. [2022], Obin [2011]), and in nonverbal behavior such as hand gestures and body posture (Obermeier et al. [2015], Wagner et al. [2014]).
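As a rough illustration of the idea of a multimodal style encoding with adversarial disentanglement, the forward pass can be sketched as below. This is a minimal NumPy sketch under assumed dimensions and random projections standing in for learned encoders; it is not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions (illustrative, not from the paper).
T, d_speech, d_text, d_gesture = 50, 64, 32, 48
d_style, d_content = 16, 24

# Toy multimodal input: one sequence of speech, text, and gesture features.
speech = rng.normal(size=(T, d_speech))
text = rng.normal(size=(T, d_text))
gesture = rng.normal(size=(T, d_gesture))

def linear(x, d_out, key):
    """A fixed random linear projection standing in for a learned layer."""
    w = np.random.default_rng(key).normal(size=(x.shape[-1], d_out))
    return x @ w / np.sqrt(x.shape[-1])

# Multimodal style encoder: fuses modalities, then pools the whole
# sequence into a single utterance-level style embedding.
fused = np.concatenate([speech, text, gesture], axis=-1)      # (T, d_all)
style = np.tanh(linear(fused, d_style, key=1)).mean(axis=0)   # (d_style,)

# Content encoder: keeps the frame-level sequence.
content = np.tanh(linear(fused, d_content, key=2))            # (T, d_content)

# Adversarial discriminator: tries to recover style information from the
# content stream; during training its gradient would be reversed so the
# content encoder learns to *discard* style (the disentanglement step).
style_logits = linear(content.mean(axis=0, keepdims=True), d_style, key=3)

print(style.shape, content.shape, style_logits.shape)
```

Separating an utterance-level style vector from frame-level content is what makes zero-shot transfer possible: at inference time, the style embedding of an unseen speaker can be swapped in while the content stream is kept.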
May-22-2023
- Genre:
- Research Report > New Finding (1.00)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks
- Deep Learning (0.94)
- Natural Language > Chatbot (0.86)
- Representation & Reasoning > Agents (1.00)
- Vision (1.00)