Speech-text based multi-modal training with bidirectional attention for improved speech recognition
