Improving the fusion of acoustic and text representations in RNN-T
