Cross-Representation Transferability of Adversarial Perturbations: From Spectrograms to Audio Waveforms

Koerich, Karl M., Esmailpour, Mohammad, Abdoli, Sajjad, Britto, Alceu S. Jr., Koerich, Alessandro L.

arXiv.org Machine Learning 

This paper demonstrates the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms. Several attacks commonly used against image classifiers are applied to Mel-frequency and short-time Fourier transform (STFT) spectrograms, and the perturbed spectrograms fool a 2D convolutional neural network (CNN) for music genre classification with a high fooling rate and high confidence. The perturbations these attacks introduce are visually imperceptible to humans. Experimental results on a dataset of Western music show that the 2D CNN achieves a mean accuracy of up to 81.87% on legitimate examples, which drops to 12.09% on adversarial examples. Furthermore, the audio waveforms reconstructed from the adversarial spectrograms perceptually resemble the legitimate audio.
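
To make the idea of perturbing a spectrogram concrete, the following is a minimal sketch and not the authors' method. The abstract does not name the attacks used, so a one-step sign-gradient perturbation (in the style of FGSM) stands in as an illustration. The 2D CNN is replaced by a hypothetical logistic model over a four-bin "spectrogram", and the weights, input values, and `epsilon` are all invented for the example. Reconstructing a waveform from the perturbed spectrogram is not covered here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, spec):
    """Probability of class 1 under a toy logistic model (a stand-in for the CNN)."""
    z = sum(w * x for w, x in zip(weights, spec)) + bias
    return sigmoid(z)

def sign(g):
    return (g > 0) - (g < 0)

def sign_gradient_perturb(weights, bias, spec, label, epsilon):
    """One-step perturbation that increases the cross-entropy loss for `label`."""
    p = predict(weights, bias, spec)
    # Gradient of binary cross-entropy with respect to the input is (p - y) * w.
    grad = [(p - label) * w for w in weights]
    # Clip at zero because magnitude spectrograms are non-negative.
    return [max(0.0, x + epsilon * sign(g)) for x, g in zip(spec, grad)]

# Hypothetical model parameters and a flattened four-bin spectrogram.
weights = [0.8, -0.5, 0.3, -0.2]
bias = 0.0
spec = [0.6, 0.2, 0.5, 0.4]
label = 1  # the legitimate class

adv = sign_gradient_perturb(weights, bias, spec, label, epsilon=0.3)

print("legitimate P(class 1):", round(predict(weights, bias, spec), 3))
print("adversarial P(class 1):", round(predict(weights, bias, adv), 3))
```

In this toy setting the perturbation is bounded by `epsilon` in every bin, yet it moves the predicted probability across the 0.5 decision threshold. The `epsilon` used here is far larger relative to the input than an imperceptible perturbation would be; it is chosen only so that four bins suffice to flip the decision.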
