Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection
