DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment