Generalized zero-shot audio-to-intent classification