TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation

Oct-10-2024, 17:51:35 GMT–Neural Information Processing Systems

Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences. Sign videos consist of continuous sequences of sign gestures with no clear boundaries in between. Existing SLT models usually represent sign visual features in a frame-wise manner to avoid explicitly segmenting the videos into isolated signs. However, these methods neglect the temporal information of signs, which leads to substantial ambiguity in translation. In this paper, we explore the temporal semantic structures of sign videos to learn more discriminative features.

hierarchical feature learning, sign language translation, temporal semantic pyramid

Industry:
- Education > Curriculum > Subject-Specific Education (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (0.75)
  - Natural Language > Machine Translation (0.64)