Can Contextual Biasing Remain Effective with Whisper and GPT-2?

Sun, Guangzhi, Zheng, Xianrui, Zhang, Chao, Woodland, Philip C.

Jun-2-2023–arXiv.org Artificial Intelligence

End-to-end automatic speech recognition (ASR) and large language models, such as Whisper and GPT-2, have recently been scaled to use vast amounts of training data. Despite the large amount of training data, infrequent content words that occur in a particular task may still exhibit poor ASR performance, with contextual biasing a possible remedy. This paper investigates the effectiveness of neural contextual biasing for Whisper combined with GPT-2. Specifically, this paper proposes integrating an adapted tree-constrained pointer generator (TCPGen) component for Whisper and a dedicated training scheme to dynamically ...

Therefore, it is essential to enhance the performance of these models on such domain-specific words without sacrificing their capacity for generalisation, and contextual biasing is one possible solution. Contextual biasing aims to incorporate contextual knowledge into end-to-end ASR systems. Contextual knowledge is often represented as a biasing list, comprising words that carry important information and are essential to downstream tasks, such as restaurant names in an ontology for a task-oriented dialogue system. The inclusion of these words in the biasing list has been shown to significantly improve recognition performance.
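To make the notion of a biasing list concrete, the following is a minimal sketch of how such a list might be organised as a prefix tree, which is the kind of structure the name "tree-constrained pointer generator" suggests. It is not the authors' implementation. The restaurant names, the class `PrefixTreeNode`, and the functions `build_prefix_tree` and `valid_next_units` are hypothetical, and characters are used as units purely for readability, whereas a real system would presumably operate on the recogniser's subword tokens.

```python
from typing import Dict, Iterable, List


class PrefixTreeNode:
    """One node of a prefix tree over biasing-list entries (hypothetical helper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "PrefixTreeNode"] = {}
        self.is_word_end = False


def build_prefix_tree(biasing_list: Iterable[str]) -> PrefixTreeNode:
    """Insert every biasing entry into the tree, one unit at a time.

    Assumption: units are lowercase characters; a real ASR system would
    more likely use its own subword vocabulary.
    """
    root = PrefixTreeNode()
    for word in biasing_list:
        node = root
        for unit in word.lower():
            node = node.children.setdefault(unit, PrefixTreeNode())
        node.is_word_end = True
    return root


def valid_next_units(root: PrefixTreeNode, prefix: str) -> List[str]:
    """Return the units that can extend `prefix` towards some biasing entry.

    An empty list means the prefix is not on any path in the tree (or is a
    complete entry with no longer continuation), so no biasing applies.
    """
    node = root
    for unit in prefix.lower():
        if unit not in node.children:
            return []
        node = node.children[unit]
    return sorted(node.children)


# Hypothetical biasing list: restaurant names from a dialogue-system ontology.
biasing_list = ["Nandos", "Nirala", "Pizza Hut"]
tree = build_prefix_tree(biasing_list)

print(valid_next_units(tree, ""))   # ['n', 'p']
print(valid_next_units(tree, "n"))  # ['a', 'i']
```

In a biasing component of this kind, the set returned by `valid_next_units` would restrict which output units can receive extra probability mass at each decoding step; how that mass is computed and combined with the recogniser's own output distribution is not shown here.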

large language model, machine learning, natural language, (21 more...)

Country:
- North America > Canada
  - Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Austria > Styria
    - Graz (0.04)
- Asia
  - South Korea > Incheon
    - Incheon (0.04)
  - China > Shanghai
    - Shanghai (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.96)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
