Joint Speech and Text Training for LLM-Based End-to-End Spoken Dialogue State Tracking
