RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction

Jun-2-2024–arXiv.org Artificial Intelligence

Recent advances in generative modeling have significantly improved the reconstruction of audio waveforms from various representations. Although diffusion models are well suited to this task, they suffer from high latency because they operate at the level of individual sample points and require many sampling steps. In this study, we introduce RFWave, a multi-band Rectified Flow approach that reconstructs high-fidelity audio waveforms from Mel-spectrograms or discrete tokens. RFWave generates complex spectrograms and operates at the frame level, processing all subbands simultaneously to improve efficiency. By leveraging Rectified Flow, which targets a straight transport trajectory, RFWave achieves reconstruction with just 10 sampling steps. Our empirical evaluations show that RFWave delivers high reconstruction quality together with high computational efficiency, generating audio up to 97 times faster than real time on a GPU.
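The few-step sampling mentioned in the abstract can be illustrated with a minimal sketch of Euler integration along a Rectified Flow style ODE, dx/dt = v(x, t), from noise at t = 0 to data at t = 1. This is not the RFWave implementation: `euler_sample`, `toy_velocity`, and `target` are hypothetical names, and the hand-written velocity field stands in for a trained network so that the transport path is exactly a straight line.

```python
import numpy as np

def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with uniform Euler steps."""
    x = np.asarray(x0, dtype=np.float64).copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # largest t used is (num_steps - 1) / num_steps, never 1.0
        x = x + dt * velocity_fn(x, t)
    return x

# Assumption: a fixed toy "data point" instead of a spectrogram frame.
target = np.array([1.0, -2.0, 0.5])

def toy_velocity(x, t):
    # Ideal straight-line velocity toward `target`; a trained model would
    # predict this quantity from (x, t) and the conditioning features.
    return (target - x) / (1.0 - t)

rng = np.random.default_rng(0)
noise = rng.standard_normal(target.shape)      # starting point at t = 0
sample = euler_sample(toy_velocity, noise, num_steps=10)
one_step = euler_sample(toy_velocity, noise, num_steps=1)
```

Because the toy trajectory is perfectly straight, Euler integration reaches `target` whether 10 steps or a single step are used. A learned velocity field is only approximately straight, which is one plausible reason a small but non-trivial number of steps, such as the 10 reported here, would still be needed in practice.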

complex spectrogram, rfwave, spectrogram

Country:
- North America
  - United States > California
    - Los Angeles County > Long Beach (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Italy
  - Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia
  - South Korea
    - Seoul > Seoul (0.04)
    - Incheon > Incheon (0.04)
  - China > Guangdong Province
    - Shenzhen (0.04)
- Africa > Rwanda
  - Kigali > Kigali (0.04)

Genre:
- Research Report > New Finding (0.88)

Industry:
- Information Technology (0.68)
- Media > Music (0.46)
- Leisure & Entertainment (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Speech (1.00)
  - Natural Language (1.00)
  - Representation & Reasoning (0.93)
  - Vision (0.93)
  - Machine Learning > Neural Networks
    - Deep Learning (0.93)
