Tailored Design of Audio-Visual Speech Recognition Models using Branchformers