Impulse Response -- data augmentation for audio deep learning
In recent years, deep learning for audio has come a long way, with models outperforming traditional signal processing techniques on many downstream tasks. However, many such solutions are trained on "homogeneous" datasets -- datasets with little variability in recording conditions (noise, accent, language, etc.). These models often perform poorly (especially on audio conversion/synthesis tasks) when used on real-world "audio events", which can contain short bursts, environmental noise, background speakers, poor microphones, etc. While there are many techniques to address this, here we concern ourselves with data augmentation using impulse responses, which can be powerful because it simulates different recording environments. An impulse response of a dynamic system describes how it reacts when presented with a brief input signal called the impulse.
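As a rough illustration of the idea, augmenting a clean recording with an impulse response amounts to convolving the two signals. The sketch below is a minimal example under stated assumptions, not the article's own implementation: the function names `apply_impulse_response` and `synthetic_ir` are hypothetical, the signal is assumed to be mono, and the "room" is a toy exponentially decaying noise burst rather than a measured impulse response.

```python
import numpy as np


def apply_impulse_response(audio, ir):
    """Convolve a mono signal with an impulse response, keeping the input length.

    Hypothetical helper for illustration only.
    """
    audio = np.asarray(audio, dtype=np.float64)
    ir = np.asarray(ir, dtype=np.float64)
    # Full linear convolution has length len(audio) + len(ir) - 1;
    # truncate the reverb tail so the output lines up with the input.
    wet = np.convolve(audio, ir, mode="full")[: len(audio)]
    # Rescale so the output peak matches the input peak (avoids clipping).
    peak_in = np.max(np.abs(audio))
    peak_out = np.max(np.abs(wet))
    if peak_out > 0:
        wet = wet * (peak_in / peak_out)
    return wet


def synthetic_ir(sample_rate=16000, duration_s=0.3, decay=8.0, seed=0):
    """Toy impulse response: exponentially decaying noise (an assumption,
    standing in for a measured room response)."""
    rng = np.random.default_rng(seed)
    n = int(round(sample_rate * duration_s))
    t = np.arange(n) / sample_rate
    ir = rng.standard_normal(n) * np.exp(-decay * t)
    ir[0] = 1.0  # direct (unreflected) path
    return ir


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    clean = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz tone
    augmented = apply_impulse_response(clean, synthetic_ir(sr))
    print(clean.shape, augmented.shape)
```

In a training pipeline, one would typically draw a different impulse response for each example so the model sees the same content under many simulated recording conditions.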
Aug-22-2021, 01:40:11 GMT