 jasper model


Jasper: A Breakthrough in Speech Recognition Technology

#artificialintelligence

Speech recognition technology has come a long way in recent years, with advances in deep learning algorithms and hardware capabilities leading to more accurate and effective models. One such model is Jasper (Just Another Speech Recognizer), a deep time delay neural network (TDNN) built from 1D-convolutional layers and introduced in recent work by Jason Li et al. Jasper is a family of models rather than a single network. Each variant has a different depth and is denoted Jasper BxR, where B is the number of blocks and R is the number of times each convolution layer within a block is repeated.
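
To make the naming scheme concrete, here is a minimal Python sketch that parses a variant name such as "10x5" and enumerates the repeated convolution layers inside the blocks. The names `JasperVariant`, `parse_variant`, and `block_layer_names` are hypothetical helpers invented for illustration. They are not part of any Jasper implementation, and the sketch counts only the layers inside the blocks, not any layers a real model may add before or after them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class JasperVariant:
    """Hypothetical container for the two numbers in a variant name."""
    blocks: int   # number of blocks
    repeats: int  # times each convolution layer is repeated within a block


def parse_variant(name: str) -> JasperVariant:
    """Parse a name like '10x5' into (blocks=10, repeats=5)."""
    blocks_text, repeats_text = name.lower().split("x")
    return JasperVariant(blocks=int(blocks_text), repeats=int(repeats_text))


def block_layer_names(variant: JasperVariant) -> List[str]:
    """List one illustrative name per convolution layer inside the blocks."""
    return [
        f"block{b}.conv{r}"
        for b in range(1, variant.blocks + 1)
        for r in range(1, variant.repeats + 1)
    ]


if __name__ == "__main__":
    variant = parse_variant("10x5")
    names = block_layer_names(variant)
    print(variant)
    print(len(names), "convolution layers inside blocks")
    print(names[:3], "...", names[-1])
```

Under this reading, the number of convolution layers inside the blocks is simply the product of the two numbers in the name, so deeper variants can be obtained by raising either one.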


NVIDIA/DeepLearningExamples

#artificialintelligence

This repository provides scripts to train the Jasper model to near state-of-the-art accuracy and to run high-performance inference using NVIDIA TensorRT. This repository is tested and maintained by NVIDIA. The Jasper model is an end-to-end neural acoustic model for automatic speech recognition (ASR) that provides near state-of-the-art results on LibriSpeech among end-to-end ASR models without any external data. Jasper's convolutional architecture was designed to facilitate fast GPU inference by allowing whole sub-blocks to be fused into a single GPU kernel. This is important for meeting the strict real-time requirements of ASR systems in deployment. The results of the acoustic model are combined with the results of external language models to get the top-ranked word sequences corresponding to a given audio segment.
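
The last step, combining acoustic model results with an external language model, can be illustrated with a small rescoring sketch. This is a generic weighted log-score combination and not the repository's actual decoding code. The `Hypothesis` type, the `rescore` function, the weight `lm_weight`, the bonus `word_bonus`, and all scores below are made-up assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical candidate transcript with two log-scores."""
    text: str
    acoustic_logp: float  # log-score from the acoustic model (made-up values)
    lm_logp: float        # log-score from an external language model (made-up)


def combined_score(h: Hypothesis, lm_weight: float, word_bonus: float) -> float:
    """Weighted sum of both log-scores plus a per-word bonus."""
    n_words = len(h.text.split())
    return h.acoustic_logp + lm_weight * h.lm_logp + word_bonus * n_words


def rescore(
    candidates: List[Hypothesis], lm_weight: float = 0.5, word_bonus: float = 0.0
) -> List[Hypothesis]:
    """Return candidates sorted from best to worst combined score."""
    return sorted(
        candidates,
        key=lambda h: combined_score(h, lm_weight, word_bonus),
        reverse=True,
    )


if __name__ == "__main__":
    candidates = [
        Hypothesis("wreck a nice beach", acoustic_logp=-3.5, lm_logp=-6.0),
        Hypothesis("recognize speech", acoustic_logp=-4.0, lm_logp=-2.0),
    ]
    for h in rescore(candidates):
        print(h.text, combined_score(h, 0.5, 0.0))
```

In this toy example the acoustic scores alone slightly favor the first candidate, while adding the language model term moves the more plausible word sequence to the top of the ranking.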