Steering Language Model to Stable Speech Emotion Recognition via Contextual Perception and Chain of Thought