From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion

Neural Information Processing Systems 

Deep generative models can generate high-fidelity audio conditioned on various types of representations (e.g., mel-spectrograms, Mel-frequency Cepstral Coefficients (MFCC)). Recently, such models have been used to synthesize audio waveforms conditioned on highly compressed representations.
