Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
