Speech Recognition Transformers: Topological-lingualism Perspective
Singh, Shruti, Singh, Muskaan, Kadyan, Virender
–arXiv.org Artificial Intelligence
Transformers have achieved great success in a variety of artificial intelligence tasks. Thanks to the recent prevalence of self-attention mechanisms, which capture long-term dependencies, they have produced phenomenal results in speech processing and recognition. This paper presents a comprehensive survey of transformer techniques for the speech modality. The survey covers (1) background on traditional ASR, the end-to-end transformer ecosystem, and speech transformers; (2) foundational speech models viewed through the lingualism paradigm, i.e., monolingual, bilingual, multilingual, and cross-lingual; (3) datasets and languages, acoustic features, architecture, decoding, and evaluation metrics from a specific topological-lingualism perspective; and (4) popular speech transformer toolkits for building end-to-end ASR systems. Finally, it discusses open challenges and potential research directions for the community to pursue in this domain.
Aug-27-2024
- Country:
- Africa (0.04)
- North America > United States
- New York (0.04)
- Asia
- East Asia (0.04)
- China (0.04)
- India > Uttarakhand
- Dehradun (0.04)
- Genre:
- Overview (1.00)
- Research Report > Promising Solution (0.45)
- Industry:
- Health & Medicine > Therapeutic Area (0.46)
- Education (0.45)
- Media (0.45)
- Technology: