Exploiting Ensemble Learning for Cross-View Isolated Sign Language Recognition

Wang, Fei, Li, Kun, Nie, Yiqi, Duan, Zhangling, Zou, Peng, Wu, Zhiliang, Wang, Yuwei, Wei, Yanyan

Feb-4-2025–arXiv.org Artificial Intelligence

In this paper, we present our solution to the Cross-View Isolated Sign Language Recognition (CV-ISLR) challenge held at WWW 2025. CV-ISLR addresses a critical limitation of traditional Isolated Sign Language Recognition (ISLR): existing datasets predominantly capture sign language videos from a frontal perspective, while real-world camera angles often vary. To recognize signs accurately from different viewpoints, models must understand gestures from multiple angles, which makes cross-view recognition challenging. To address this, we explore ensemble learning, which enhances model robustness and generalization across diverse views. Our approach, built on a multi-dimensional Video Swin Transformer model, leverages this ensemble strategy to achieve competitive performance. Our solution ranked 3rd in both the RGB-based ISLR and RGB-D-based ISLR tracks, demonstrating its effectiveness in handling the challenges of cross-view recognition. The code is available at: https://github.com/Jiafei127/CV_ISLR_WWW2025.
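The abstract does not say how the individual models are combined, so the following is only a minimal sketch of one common ensemble strategy, score-level fusion, in which each model's class logits are converted to probabilities and averaged. The function names (`softmax`, `ensemble_predict`), the optional `weights` parameter, and the toy logits are all hypothetical and are not taken from the authors' code.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(
    per_model_logits: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> Tuple[int, List[float]]:
    """Fuse the predictions of several models for ONE video clip.

    per_model_logits: one list of class logits per model (assumed layout).
    weights: optional per-model weights; defaults to a plain average.
    Returns the predicted class index and the fused probabilities.
    """
    n_models = len(per_model_logits)
    if n_models == 0:
        raise ValueError("need at least one model")
    if weights is None:
        weights = [1.0 / n_models] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    n_classes = len(per_model_logits[0])
    fused = [0.0] * n_classes
    for weight, logits in zip(weights, per_model_logits):
        if len(logits) != n_classes:
            raise ValueError("all models must score the same classes")
        # Accumulate the weighted probability of each class.
        for cls, prob in enumerate(softmax(logits)):
            fused[cls] += weight * prob

    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


# Toy example: three hypothetical models scoring three sign classes.
toy_logits = [
    [2.0, 1.0, 0.0],  # model A prefers class 0
    [0.0, 3.0, 0.0],  # model B prefers class 1
    [0.5, 2.0, 0.0],  # model C prefers class 1
]
predicted_class, fused_probs = ensemble_predict(toy_logits)
```

In this toy case the averaged probabilities favor class 1 even though one model disagrees, which is the kind of smoothing over individual model errors that an ensemble is meant to provide. Non-uniform `weights` could, for instance, favor models that validate better on non-frontal views, though whether the authors do this is not stated.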


