Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification
