Adaptive Inner Speech-Text Alignment for LLM-based Speech Translation