Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
Neural Information Processing Systems
Audio-driven human animation methods, such as talking-head and talking-body generation, have made remarkable progress in generating synchronized facial movements and visually appealing videos. However, existing methods focus primarily on single-person animation and struggle with multi-stream audio inputs, suffering from incorrect binding between audio streams and persons.
Jun-12-2026, 13:35:13 GMT