DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment
