The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition
Ranum, Oline, Wessels, David R., Otterspeer, Gomer, Bekkers, Erik J., Roelofsen, Floris, Andersen, Jari I.
arXiv.org Artificial Intelligence
Sign Language Processing (SLP) provides a foundation for a more inclusive future in language technology; however, the field faces several significant challenges that must be addressed to achieve practical, real-world applications. This work addresses multi-view isolated sign recognition (MV-ISR), and highlights the essential role of 3D awareness and geometry in SLP systems. We introduce the NGT200 dataset, a novel spatio-temporal multi-view benchmark, establishing MV-ISR as distinct from single-view ISR (SV-ISR). We demonstrate the benefits of synthetic data and propose conditioning sign representations on spatial symmetries inherent in sign language. Leveraging an SE(2) equivariant model improves MV-ISR performance by 8%-22% over the baseline.
Sep-3-2024
- Country:
- Europe
- Austria > Vienna (0.14)
- Italy > Piedmont > Turin Province > Turin (0.14)
- Middle East > Malta (0.14)
- North America > United States (0.93)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Education (0.61)
- Health & Medicine (0.68)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks (0.93)
- Natural Language (1.00)
- Vision (1.00)