TSLFormer: A Lightweight Transformer Model for Turkish Sign Language Recognition Using Skeletal Landmarks
Ertürk, Kutay, Altınışık, Furkan, Sarıaltın, İrem, Gerek, Ömer Nezih
arXiv.org Artificial Intelligence
This study presents TSLFormer, a lightweight and robust word-level Turkish Sign Language (TID) recognition model that treats sign gestures as ordered, string-like language. Rather than working with raw RGB or depth videos, our method uses only 3D joint positions (articulation points) extracted with Google's MediaPipe library, focusing on the skeletal landmarks of the hands and torso. This substantially reduces input dimensionality while preserving most of the gesture's important semantic information. Our approach recasts sign language recognition as sequence-to-sequence translation, drawing inspiration from the linguistic nature of sign languages and the transformer's success at natural language translation. Because TSLFormer adopts the transformer's self-attention mechanism, it effectively represents temporal co-occurrence within a sign sequence, emphasizing significant movement patterns over time much as words are related to one another within a sentence. Tested and validated on the AUTSL dataset, which holds over 36,000 sign samples of 226 different words, TSLFormer achieves competitive performance with minimal computational demands. The experiments show a rich spatiotemporal understanding of signs and indicate that, using only joint landmarks, recognition is feasible in real-time, mobile, and assistive technologies that facilitate communication for hearing-impaired individuals.

Sign language is an essential communication method through which hearing-impaired people express ideas and sentiments using hand gestures, facial expressions, and body movement. Unlike spoken languages, which rely on auditory and verbal modalities, sign language uses visual and spatial modalities to convey meaning. However, because few people are proficient in sign language, communication gaps persist and hinder inclusion, particularly in everyday social interaction and in employment, educational, and healthcare settings.
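The core idea in the abstract, a sequence of per-frame landmark vectors passed through self-attention and then mapped to a word label, can be illustrated with a minimal sketch. This is not the TSLFormer architecture: the landmark count, the identity query/key/value projections, the mean pooling, the random untrained classifier weights, and all function names (`self_attention`, `classify`) are assumptions made purely for illustration, and the input frames are random numbers standing in for real landmark extractor output.

```python
import math
import random

NUM_LANDMARKS = 3  # hypothetical; a real landmark extractor yields many more points
COORDS = 3         # x, y, z per landmark


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with Q = K = V = frames.

    A trained model would apply learned projections first; identity
    projections are an assumption that keeps the sketch short.
    """
    scale = math.sqrt(len(frames[0]))
    # weights[i][t]: how strongly frame i attends to frame t
    weights = [softmax([dot(q, k) / scale for k in frames]) for q in frames]
    # Each output frame is a weighted average of all input frames.
    attended = [[dot(row, column) for column in zip(*frames)] for row in weights]
    return attended, weights


def classify(frames, class_weights):
    """Mean-pool the attended frames over time, then apply a linear layer."""
    attended, _ = self_attention(frames)
    pooled = [sum(column) / len(attended) for column in zip(*attended)]
    logits = [dot(pooled, w) for w in class_weights]
    return max(range(len(logits)), key=logits.__getitem__)


rng = random.Random(0)
dim = NUM_LANDMARKS * COORDS
# Five frames of flattened (x, y, z) landmark coordinates; random stand-ins.
frames = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
# Four hypothetical word classes with untrained random weights.
class_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(4)]

attended, weights = self_attention(frames)
predicted = classify(frames, class_weights)
```

Each row of `weights` sums to one, so every attended frame is a convex combination of the frames in the sequence. This is the sense in which attention can emphasize the most informative movements over time.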
Jun-19-2025
- Country:
  - Asia > Middle East
    - Republic of Türkiye > Eskisehir Province > Eskisehir (0.05)
  - North America > United States
    - Massachusetts (0.04)
- Genre:
  - Research Report > New Finding (0.68)
- Industry:
  - Education > Curriculum > Subject-Specific Education (1.00)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Statistical Learning (0.66)
    - Natural Language (1.00)
    - Vision (1.00)