Robust Multi-Modal Speech In-Painting: A Sequence-to-Sequence Approach