Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?
