PESTalk: Speech-Driven 3D Facial Animation with Personalized Emotional Styles

Han, Tianshun, Zhou, Benjia, Liu, Ajian, Liang, Yanyan, Zhang, Du, Lei, Zhen, Wan, Jun

Dec-8-2025, arXiv.org Artificial Intelligence

PESTalk is a novel method for generating 3D facial animations with personalized emotional styles directly from speech. It addresses key limitations of existing approaches with two components: a Dual-Stream Emotion Extractor (DSEE), which captures both time-domain and frequency-domain audio features for fine-grained emotion analysis, and an Emotional Style Modeling Module (ESMM), which models individual expression patterns based on voiceprint characteristics. To address data scarcity, the method uses a newly constructed dataset, 3D-EmoStyle. In the reported evaluations, PESTalk outperforms state-of-the-art methods in producing realistic and personalized facial animations.
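To make the "dual-stream" idea concrete, the following is a minimal sketch of extracting time-domain and frequency-domain features from the same audio frames and concatenating them. It is not the authors' DSEE, which is presumably a learned network: the hand-crafted features (RMS energy, zero-crossing rate, DFT magnitudes), the function names, the frame sizes, and the toy sine input are all assumptions chosen for illustration.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of equal length."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def time_features(frame):
    """Time-domain stream (assumed features): RMS energy and zero-crossing rate."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(frame) - 1)]


def freq_features(frame):
    """Frequency-domain stream (assumed features): naive DFT magnitudes
    for the non-negative frequency bins 0 .. n/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def dual_stream_features(samples, frame_len=64, hop=32):
    """Compute both streams per frame and concatenate them."""
    return [time_features(f) + freq_features(f)
            for f in frame_signal(samples, frame_len, hop)]


# Toy input (hypothetical): a 440 Hz sine sampled at 8 kHz, 256 samples.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
features = dual_stream_features(signal)
```

Each entry of `features` holds two time-domain values followed by the spectrum magnitudes for one frame. In a learned system, each stream would more plausibly feed its own encoder before fusion; plain concatenation is used here only to keep the sketch short.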

artificial intelligence, machine learning, natural language


Country:
- Asia > China (0.15)

Genre:
- Research Report > Promising Solution (0.68)

Technology:
- Information Technology
  - Graphics > Animation (0.99)
  - Artificial Intelligence
    - Natural Language (1.00)
    - Vision (0.96)
    - Machine Learning > Neural Networks (0.93)
