MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset
–Neural Information Processing Systems
Isolated Sign Language Recognition (ISLR) focuses on identifying individual sign language signs. Considering the diversity of sign languages across geographical regions, developing region-specific ISLR datasets is crucial for supporting communication and research. Auslan, as a sign language specific to Australia, still lacks a dedicated large-scale word-level dataset for the ISLR task. To fill this gap, we curate the first large-scale Multi-view Multi-modal Word-Level Australian Sign Language recognition dataset, dubbed MM-WLAuslan. Compared to other publicly available datasets, MM-WLAuslan exhibits three significant advantages: (1) the largest amount of data, (2) the most extensive vocabulary, and (3) the most diverse multi-modal camera views. Specifically, we record 282K+ sign videos covering 3,215 commonly used Auslan glosses presented by 73 signers in a studio environment. Moreover, our filming system includes two different types of cameras, i.e., three Kinect-V2 cameras and a RealSense camera. We position the cameras hemispherically around the front half of the model and record videos simultaneously with all four cameras.
May-30-2025, 12:24:19 GMT
- Country:
  - Asia > Middle East
    - Israel (0.14)
  - Europe > Austria
    - Vienna (0.14)
  - North America > United States (1.00)
  - Oceania > Australia (0.66)
- Genre:
  - Research Report (0.93)
- Industry:
  - Education > Curriculum > Subject-Specific Education (1.00)
- Technology: