Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation