Closing the Gap Between Text and Speech Understanding in LLMs