A Comparative Analysis of Recurrent and Attention Architectures for Isolated Sign Language Recognition
Nigar Alishzade, Gulchin Abdullayeva
arXiv.org Artificial Intelligence
This study presents a systematic comparative analysis of recurrent and attention-based neural architectures for isolated sign language recognition. We implement and evaluate two representative models, ConvLSTM and Vanilla Transformer, on the Azerbaijani Sign Language Dataset (AzSLD) and the Word-Level American Sign Language (WLASL) dataset. Our results demonstrate that the attention-based Vanilla Transformer consistently outperforms the recurrent ConvLSTM in both Top-1 and Top-5 accuracy across datasets, achieving up to 76.8% Top-1 accuracy on AzSLD and 88.3% on WLASL. The ConvLSTM, while more computationally efficient, lags in recognition accuracy, particularly on smaller datasets. These findings highlight the complementary strengths of each paradigm: the Transformer excels in overall accuracy and signer independence, whereas the ConvLSTM offers advantages in computational efficiency and temporal modeling. The study provides a nuanced analysis of these trade-offs, offering guidance for architecture selection in sign language recognition systems depending on application requirements and resource constraints.
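The abstract reports results in terms of Top-1 and Top-5 accuracy, the standard metrics for isolated sign recognition: a prediction counts as correct if the true gloss is the single highest-scoring class (Top-1) or among the five highest-scoring classes (Top-5). A minimal sketch of the metric (the toy scores below are illustrative, not taken from the paper):

```python
import numpy as np

def top_k_accuracy(logits, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    topk = np.argsort(logits, axis=1)[:, -k:]      # indices of the k largest scores per row
    hits = (topk == labels[:, None]).any(axis=1)   # true label appears in the top-k set?
    return hits.mean()

# Toy class scores for 4 samples over 3 classes (hypothetical example).
logits = np.array([
    [0.1, 0.7, 0.2],   # top-1 prediction: class 1
    [0.8, 0.1, 0.1],   # top-1 prediction: class 0
    [0.3, 0.3, 0.4],   # top-1 prediction: class 2
    [0.2, 0.5, 0.3],   # top-1 prediction: class 1
])
labels = np.array([1, 0, 1, 2])

print(top_k_accuracy(logits, labels, 1))  # 0.5
print(top_k_accuracy(logits, labels, 2))  # 1.0
```

Top-k accuracy is always at least as high as Top-1, which is why the paper's Top-5 figures bound its Top-1 figures from above for both architectures.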
Nov-18-2025