Instruction Data Generation and Unsupervised Adaptation for Speech Language Models