VHASR: A Multimodal Speech Recognition System With Vision Hotwords
