Gloss2Text: Sign Language Gloss translation using LLMs and Semantically Aware Label Smoothing

Fayyazsanavi, Pooya, Anastasopoulos, Antonios, Košecká, Jana

Jul-12-2024–arXiv.org Artificial Intelligence

Sign language translation from video to spoken text presents unique challenges owing to the distinct grammar, expression nuances, and high variation of visual appearance across different speakers and contexts. The intermediate gloss annotations of videos aim to guide the translation process. In our work, we focus on the Gloss2Text translation stage and propose several advances by leveraging pre-trained large language models (LLMs), data augmentation, and a novel label-smoothing loss function that exploits gloss translation ambiguities, significantly improving the performance of state-of-the-art approaches. Through extensive experiments and ablation studies on the PHOENIX Weather 2014T dataset, our approach surpasses state-of-the-art performance in Gloss2Text translation, indicating its efficacy in addressing sign language translation and suggesting promising avenues for future research and development.

Figure 1: An example of ambiguity in sign language is demonstrated by the gloss "BEWOELKT (CLOUDY)," which is represented in multiple translations within the dataset. As shown, ambiguity may share the same meaning but differ in form, such as "wolken (cloudy)," or where the gloss represents the concept meaning, such as "unbeständig (unstable)."
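The abstract does not spell out the loss, so the following is only a minimal sketch of one plausible reading of label smoothing that exploits translation ambiguity: instead of spreading the smoothing mass uniformly over the vocabulary, it is spread over tokens listed as close alternatives of the gold token. The function names, the `similar` lookup table, the toy vocabulary, and the value `eps=0.1` are all assumptions made for illustration and are not taken from the paper.

```python
import math

def smoothed_targets(target, vocab, similar, eps=0.1):
    """Build a soft target distribution for one gold token.

    Keeps 1 - eps on the gold token and spreads eps evenly over its listed
    neighbours. Falls back to uniform smoothing over the rest of the
    vocabulary when no neighbours are known (hypothetical design choice).
    """
    # Deduplicate neighbours and keep only in-vocabulary tokens.
    listed = dict.fromkeys(similar.get(target, []))
    neighbours = [t for t in listed if t in vocab and t != target]
    if not neighbours:
        neighbours = [t for t in vocab if t != target]

    dist = {tok: 0.0 for tok in vocab}
    if not neighbours:  # vocabulary contains only the gold token
        dist[target] = 1.0
        return dist

    dist[target] = 1.0 - eps
    for tok in neighbours:
        dist[tok] = eps / len(neighbours)
    return dist

def soft_cross_entropy(dist, log_probs):
    """Cross-entropy between a soft target and model log-probabilities."""
    return -sum(p * log_probs[tok] for tok, p in dist.items() if p > 0.0)

# Toy example: the neighbour list for "wolken" is an illustrative assumption.
vocab = ["wolken", "bewölkt", "unbeständig", "sonne"]
similar = {"wolken": ["bewölkt", "unbeständig"]}

dist = smoothed_targets("wolken", vocab, similar, eps=0.1)
uniform_log_probs = {tok: math.log(1.0 / len(vocab)) for tok in vocab}
loss = soft_cross_entropy(dist, uniform_log_probs)
print(dist)
print(loss)
```

In this toy setup the unrelated token "sonne" receives no smoothing mass, whereas standard uniform label smoothing would reward it as much as a plausible alternative translation.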

Genre:
- Research Report > Promising Solution (0.48)

Industry:
- Education > Curriculum > Subject-Specific Education (1.00)

Technology:
- Information Technology > Artificial Intelligence > Natural Language
  - Machine Translation (1.00)
  - Large Language Model (1.00)
