Mixtures of Experts for Audio-Visual Learning