Can Contextual Biasing Remain Effective with Whisper and GPT-2?

Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

arXiv.org Artificial Intelligence 

End-to-end automatic speech recognition (ASR) and large language models, such as Whisper and GPT-2, have recently been scaled to use vast amounts of training data. Despite the large amount of training data, infrequent content words that occur in a particular task may still exhibit poor ASR performance, with contextual biasing a possible remedy. This paper investigates the effectiveness of neural contextual biasing for Whisper combined with GPT-2. Specifically, this paper proposes integrating an adapted tree-constrained pointer generator (TCPGen) component for Whisper and a dedicated training scheme to dynamically ...

Therefore, it is essential to enhance the performance of these models on such domain-specific words without sacrificing their capacity for generalisation, and contextual biasing is one possible solution. Contextual biasing aims to incorporate contextual knowledge into end-to-end ASR systems. Contextual knowledge is often represented as a biasing list, comprising words that carry important information and are essential to downstream tasks, such as restaurant names in an ontology for a task-oriented dialogue system. The inclusion of these words in the biasing list has been shown to significantly improve recognition performance.
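To make the idea of a tree-structured biasing list concrete, the following is a minimal sketch of the lookup step only. It assumes a character-level prefix tree for readability, whereas a real system would more likely operate on subword units. The function names `build_prefix_tree` and `valid_next_units` and the example restaurant names are hypothetical and chosen for illustration. The neural pointer generator component of TCPGen is not modelled here.

```python
from typing import Dict, Iterable, Set

END = "<end>"  # hypothetical marker for a complete biasing word


def build_prefix_tree(biasing_list: Iterable[str]) -> Dict:
    """Build a character-level prefix tree (trie) from the biasing words."""
    root: Dict = {}
    for word in biasing_list:
        node = root
        for ch in word.lower():
            node = node.setdefault(ch, {})
        node[END] = {}  # mark that a biasing word ends at this node
    return root


def valid_next_units(tree: Dict, prefix: str) -> Set[str]:
    """Return the units that can extend `prefix` along some biasing word."""
    node = tree
    for ch in prefix.lower():
        if ch not in node:
            return set()  # prefix is not on any path in the tree
        node = node[ch]
    return set(node.keys())


# Illustrative biasing list (assumed example data, not from the paper).
biasing_list = ["Nandos", "Nobu", "Noma"]
tree = build_prefix_tree(biasing_list)

print(sorted(valid_next_units(tree, "no")))  # ['b', 'm']
print(valid_next_units(tree, "noma"))        # {'<end>'}
print(valid_next_units(tree, "x"))           # set()
```

In this sketch, the set returned for a partially decoded word is the only place where a biasing mechanism would need to boost probability mass, which is the sense in which the tree constrains the candidates at each decoding step.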
