Enhancing Speech Large Language Models through Reinforced Behavior Alignment