Scaling Sign Language Translation

May-27-2025, 17:11:51 GMT–Neural Information Processing Systems

Sign language translation (SLT) addresses the problem of translating information from a sign language in video to a spoken language in text. Existing studies, while showing progress, are often limited to narrow domains and/or few sign languages and struggle with open-domain tasks. In this paper, we push forward the frontier of SLT by scaling pretraining data, model size, and number of translation directions. We perform large-scale SLT pretraining on different data including 1) noisy multilingual Youtube SLT data,2) parallel text corpora, and 3) SLT data augmented by translating video captions to other languages with off-the-shelf machine translation models. We unify different pretraining tasks with task-specific prompts under the encoder-decoder architecture, and initialize the SLT model with pretrained (m/By)T5 models across model sizes.

model size, scaling sign language translation, slt, (2 more...)

Neural Information Processing Systems

May-27-2025, 17:11:51 GMT

Conferences Web Page

Add feedback

Industry:
- Education > Curriculum > Subject-Specific Education (1.00)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.98)