Duplex Diffusion Models Improve Speech-to-Speech Translation

May-21-2023–arXiv.org Artificial Intelligence

Speech-to-speech translation is a typical sequence-to-sequence learning task that naturally has two directions. How can bidirectional supervision signals be leveraged effectively to produce high-fidelity audio in both directions? Existing approaches train either two separate models or a single multitask model, with low efficiency and inferior performance. In this paper, we propose a duplex diffusion model that applies diffusion probabilistic models to both sides of a reversible duplex Conformer, so that either end can simultaneously input and output speech in a distinct language. Our model enables reversible speech translation by simply flipping the input and output ends. Experiments show that our model achieves the first success in reversible speech translation, with significant improvements in ASR-BLEU scores over a list of state-of-the-art baselines.
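
The abstract does not say how the reversible network is built. As a minimal sketch of what "flipping the input and output ends" can mean, the block below uses additive coupling, a common construction in reversible residual networks. This is an assumption chosen for illustration, not the paper's implementation. The names `forward`, `reverse`, `toy_f`, and `toy_g` are hypothetical, and the toy functions stand in for learned sublayers.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Transform = Callable[[Vector], Vector]


def _add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def _sub(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def forward(x1: Vector, x2: Vector, f: Transform, g: Transform) -> Tuple[Vector, Vector]:
    """Run one additive-coupling layer from the input end to the output end."""
    y1 = _add(x1, f(x2))
    y2 = _add(x2, g(y1))
    return y1, y2


def reverse(y1: Vector, y2: Vector, f: Transform, g: Transform) -> Tuple[Vector, Vector]:
    """Run the same layer from the output end back to the input end."""
    x2 = _sub(y2, g(y1))
    x1 = _sub(y1, f(x2))
    return x1, x2


def toy_f(v: Vector) -> Vector:
    # Hypothetical stand-in for a learned sublayer (assumption, not from the paper).
    return [2.0 * x for x in v]


def toy_g(v: Vector) -> Vector:
    # Hypothetical stand-in for a second learned sublayer (assumption).
    return [x + 1.0 for x in v]
```

Because `reverse` undoes `forward` with the same `f` and `g`, one set of parameters serves both directions; `f` and `g` themselves never need to be inverted.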

machine learning, natural language, translation

Genre:
- Research Report (0.83)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Natural Language > Machine Translation (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
