Signal Combination for Language Identification

Wang, Shengye; Wan, Li; Yu, Yang; Moreno, Ignacio Lopez

Oct-21-2019–arXiv.org Machine Learning

ABSTRACT Google's multilingual speech recognition system combines low-level acoustic signals with language-specific recognizer signals to better predict the language of an utterance. This paper presents our experience with different signal combination methods to improve overall language identification accuracy. We compare the performance of a lattice-based ensemble model and a deep neural network model to combine signals from recognizers with that of a baseline that only uses low-level acoustic signals. Experimental results show that the deep neural network model outperforms the lattice-based ensemble model, and it reduced the error rate from 5.5% in the baseline to 4.3%.

Index Terms: Signal combination, language identification, lattice regression, deep neural network

1. INTRODUCTION

Multilingual speech recognition is an important feature for modern speech recognition systems, allowing users to speak in more than a single, preset language.
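The signal combination idea described in the abstract can be illustrated with a small sketch. This is not the authors' model: the feature layout (one acoustic language-ID score and one recognizer confidence per candidate language), the single hidden layer, the random untrained weights, and every name below (`combine_signals`, `random_params`, `dense`) are assumptions made purely for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(weights, bias, inputs):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def combine_signals(acoustic, recognizer, params):
    """Concatenate both signal types and score each candidate language.

    Hypothetical layout: acoustic[i] and recognizer[i] both refer to
    candidate language i.
    """
    features = list(acoustic) + list(recognizer)
    hidden = [max(0.0, h)  # ReLU
              for h in dense(params["w_hidden"], params["b_hidden"], features)]
    return softmax(dense(params["w_out"], params["b_out"], hidden))


def random_params(n_inputs, n_hidden, n_languages, seed=0):
    """Untrained placeholder weights; a real system would learn these."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_hidden": matrix(n_hidden, n_inputs),
        "b_hidden": [0.0] * n_hidden,
        "w_out": matrix(n_languages, n_hidden),
        "b_out": [0.0] * n_languages,
    }


# Made-up scores for two candidate languages.
acoustic = [0.6, 0.4]    # low-level acoustic language-ID scores
recognizer = [0.2, 0.9]  # per-language recognizer confidences
params = random_params(n_inputs=4, n_hidden=8, n_languages=2)
posterior = combine_signals(acoustic, recognizer, params)
predicted = max(range(len(posterior)), key=posterior.__getitem__)
```

With untrained weights the predicted index carries no meaning. The sketch only shows how the two signal types could be fed to one classifier, in contrast to a baseline that would see `acoustic` alone.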

accuracy, international conference, language identification



Country:
- Africa > South Africa (0.04)
- North America
  - United States > California
    - San Diego County
      - San Diego (0.04)
      - La Jolla (0.04)
  - Canada > Alberta
    - Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
- Europe
  - United Kingdom > England
    - East Sussex > Brighton (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)

Genre:
- Research Report > New Finding (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.77)
