Approaching Dialogue State Tracking via Aligning Speech Encoders and LLMs