Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
