Non-Autoregressive Sign Language Production via Knowledge Distillation
Eui Jun Hwang, Jung Ho Kim, Suk Min Cho, Jong C. Park
–arXiv.org Artificial Intelligence
Sign Language Production (SLP) aims to translate spoken-language expressions into their sign language counterparts, such as skeleton-based sign poses or videos. Existing SLP models are either AutoRegressive (AR) or Non-Autoregressive (NAR). AR-SLP models, however, suffer from regression to the mean and error propagation during decoding. NSLP-G, a NAR-based model, resolves these issues to some extent but introduces problems of its own: it does not account for target sign lengths and suffers from false decoding initiation. We propose a novel NAR-SLP model via Knowledge Distillation (KD) to address these problems. First, we devise a length regulator to predict the end of the generated sign pose sequence. We then adopt KD, which distills spatial-linguistic features from a pre-trained pose encoder, to alleviate false decoding initiation. Extensive experiments show that the proposed approach significantly outperforms existing SLP models in both Fréchet Gesture Distance and Back-Translation evaluation.
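As a rough illustration of the feature-level distillation the abstract describes (a minimal sketch, not the authors' implementation; the function names, the MSE formulation, and the `alpha` weighting are assumptions), the student's decoded features can be regressed toward those of a frozen pre-trained pose encoder alongside the usual pose regression objective:

```python
def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    assert len(a) == len(b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def kd_loss(student_feats, teacher_feats, pose_pred, pose_gt, alpha=0.5):
    """Combined training loss: pose regression plus feature distillation.

    student_feats: features decoded by the NAR student model
    teacher_feats: features from the frozen pre-trained pose encoder (teacher)
    alpha: hypothetical weight on the distillation term
    """
    regression = mse(pose_pred, pose_gt)          # fit the ground-truth poses
    distillation = mse(student_feats, teacher_feats)  # mimic the teacher's features
    return regression + alpha * distillation
```

In this sketch the distillation term pulls the student's early decoded features toward a well-trained representation, which is one plausible way the reported mitigation of false decoding initiation could work.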
Aug-12-2022