Tackling Interpretability in Audio Classification Networks with Non-negative Matrix Factorization

Parekh, Jayneel, Parekh, Sanjeel, Mozharovskyi, Pavlo, Richard, Gaël, d'Alché-Buc, Florence

May-11-2023–arXiv.org Artificial Intelligence

This paper tackles two major problem settings for interpretability of audio processing networks, post-hoc and by-design interpretation. For post-hoc interpretation, we aim to interpret decisions of a network in terms of high-level audio objects that are also listenable for the end-user. This is extended to present an inherently interpretable model with high performance. To this end, we propose a novel interpreter design that incorporates non-negative matrix factorization (NMF). In particular, an interpreter is trained to generate a regularized intermediate embedding from hidden layers of a target network, learnt as time-activations of a pre-learnt NMF dictionary. Our methodology allows us to generate intuitive audio-based interpretations that explicitly enhance parts of the input signal most relevant for a network's decision. We demonstrate our method's applicability on a variety of classification tasks, including multi-label data for real-world audio and music.

artificial intelligence, interpretation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

May-11-2023

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.28)

Genre:
- Research Report (0.82)

Industry:
- Leisure & Entertainment (0.46)
- Media > Music (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found