Samba-ASR: State-Of-The-Art Speech Recognition Leveraging Structured State-Space Models

Shakhadri, Syed Abdul Gaffar, KR, Kruthika, Angadi, Kartik Basavaraj

Jan-8-2025–arXiv.org Artificial Intelligence

The rapid evolution of deep learning has significantly transformed Automatic Speech Recognition (ASR), shifting from traditional systems such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) to advanced end-to-end neural architectures. While innovations such as Connectionist Temporal Classification (CTC) and attentionbased encoder-decoder models have established new baselines [1], transformer-based models like OpenAI's Whisper have further pushed the boundaries, setting state-of-the-art benchmarks for multilingual, multitask ASR systems [2]. Despite their successes, transformer architectures face inherent challenges in scaling to long sequences, particularly those encountered in extended audio recordings.

architecture, samba-asr, sequence, (14 more...)

arXiv.org Artificial Intelligence

Jan-8-2025

arXiv.org PDF

Add feedback

Country:
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (0.46)
- Media (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Natural Language (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.88)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found