Scaling Sign Language Translation

Feb-18-2026, 05:22:25 GMT–Neural Information Processing Systems

Sign language translation (SLT) addresses the problem of translating information from a sign language in video to a spoken language in text. Existing studies, while showing progress, are often limited to narrow domains and/or few sign languages and struggle with open-domain tasks. In this paper, we push forward the frontier of SLT by scaling pretraining data, model size, and number of translation directions. We perform large-scale SLT pretraining on different data including 1) noisy multilingual YouTube SLT data, 2) parallel text corpora, and 3) SLT data augmented by translating video captions to other languages with off-the-shelf machine translation models. We unify different pretraining tasks with task-specific prompts under the encoder-decoder architecture, and initialize the SLT model with pretrained (m/By)T5 models across model sizes. SLT pretraining results on How2Sign and FLEURS-ASL#0 (ASL to 42 spoken languages) demonstrate the significance of data/model scaling and cross-lingual cross-modal transfer, as well as the feasibility of zero-shot SLT. We finetune the pretrained SLT models on 5 downstream open-domain SLT benchmarks covering 5 sign languages. Experiments show substantial quality improvements over the vanilla baselines, surpassing the previous state-of-the-art (SOTA) by wide margins.
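The abstract says the pretraining tasks are unified with task-specific prompts under one encoder-decoder model, but it does not give the prompt format. The following is only a rough sketch of the general idea, assuming a text prefix that names the task and target language. The task names (`slt`, `aug_slt`, `mt`), the `Example` fields, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Example:
    """One pretraining example; all field names here are hypothetical."""
    task: str                           # "slt", "aug_slt", or "mt"
    target_lang: str                    # language of the output text
    target_text: str                    # text the decoder should produce
    source_text: Optional[str] = None   # set only for text-to-text MT
    video_id: Optional[str] = None      # set only for sign video examples


def build_prompt(ex: Example) -> str:
    """Return a text prompt telling the shared model which task to run.

    The wording of each prompt is an assumption made for illustration.
    """
    if ex.task in ("slt", "aug_slt"):
        if ex.video_id is None:
            raise ValueError("sign language examples need a video_id")
        # Video features would be passed to the encoder separately;
        # the prompt only carries the task and the target language.
        return f"translate sign video to {ex.target_lang}:"
    if ex.task == "mt":
        if ex.source_text is None:
            raise ValueError("MT examples need source_text")
        return f"translate text to {ex.target_lang}: {ex.source_text}"
    raise ValueError(f"unknown task: {ex.task}")


if __name__ == "__main__":
    batch = [
        Example("slt", "en", "hello", video_id="vid_001"),
        Example("aug_slt", "de", "hallo", video_id="vid_001"),
        Example("mt", "fr", "bonjour", source_text="hello"),
    ]
    for ex in batch:
        print(build_prompt(ex), "->", ex.target_text)
```

In this sketch the augmented SLT example reuses the same video with a machine-translated caption as its target, so a single model sees all three data sources through one interface and differs only in the prompt it receives.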

large language model, machine learning, translation

Country:
- Oceania > New Zealand (0.04)
- South America
  - Uruguay (0.04)
  - Paraguay (0.04)
  - Colombia > Meta Department
    - Villavicencio (0.04)
- North America
  - United States (0.04)
  - Canada > Quebec (0.04)
- Europe
  - Switzerland (0.04)
  - Spain (0.04)
  - Slovenia (0.04)
  - Serbia (0.04)
  - Belarus (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - Singapore (0.04)
  - Macao (0.04)
  - Indonesia > Bali (0.04)
  - Vietnam (0.04)
  - Taiwan (0.04)
  - China > Hong Kong (0.04)
  - Thailand > Bangkok
    - Bangkok (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Education > Curriculum > Subject-Specific Education (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Machine Translation (1.00)
    - Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.67)
