CAT: CRF-based ASR Toolkit

An, Keyu; Xiang, Hongyu; Ou, Zhijian

arXiv.org Machine Learning 

ABSTRACT

In this paper, we present a new open source toolkit for automatic speech recognition (ASR), named CAT (CRF-based ASR Toolkit). A key feature of CAT is discriminative training in the framework of conditional random fields (CRFs), particularly with a connectionist temporal classification (CTC) inspired state topology. CAT contains a full-fledged implementation of CTC-CRF and provides a complete workflow for CRF-based end-to-end speech recognition. Evaluation results on English and Chinese benchmarks such as Switchboard and Aishell show that CAT obtains state-of-the-art results among existing end-to-end models with fewer parameters, and is competitive with hybrid DNN-HMM models. Towards flexibility, we show that i-vector based speaker-adapted recognition and a latency control mechanism can be explored easily and effectively in CAT. We hope CAT, especially the CRF-based framework and software, will be of broad interest to the community, and can be further explored and improved.

Index Terms: speech recognition, open source toolkit, conditional random field, end-to-end

1. INTRODUCTION

In addition to theories and algorithms, open source toolkits make substantial contributions to automatic speech recognition (ASR) technologies.
