Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding
