Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation