Deep Learning for Audio Signal Processing

Purwins, Hendrik, Li, Bo, Virtanen, Tuomas, Schlüter, Jan, Chang, Shuo-yiin, Sainath, Tara

Apr-30-2019–arXiv.org Machine Learning

JOURNAL OF SELECTED TOPICS OF SIGNAL PROCESSING, VOL.

This is a PREPRINT. Manuscript received October 11, 2018. Personal use of this material is permitted.

H. Purwins is with Department of Architecture, Design & Media Technology, [...]

Abstract: Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. [...] Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic [...]

I. INTRODUCTION

[...] [2] in 1986, and finally 3) the success of deep learning in speech recognition [3] and image classification [4] in 2012, leading to a renaissance of deep learning, involving e.g. [...] networks (CNNs, [6]) and long short-term memory (LSTM, [7]). In this "deep" paradigm, architectures with a large number of parameters are trained to learn from a massive amount of [...] many areas of signal processing, often outperforming traditional signal processing on a large scale. In this most recent wave, deep learning first gained traction in image processing [4], but was then widely adopted in speech processing, music [...]

While many deep learning methods have been adopted from [...] Audio signals are commonly transformed into two-dimensional time-frequency representations for processing, but the two axes, [...] Images are instantaneous snapshots of a target and often analyzed as a whole or in patches with little order constraints; however, audio signals have to be studied sequentially in chronological order.

METHODS

To set the stage, we give a conceptual overview of audio analysis and synthesis problems (II-A), the input representations commonly used to address them (II-B), and the models shared between different application fields (II-C).

This division encompasses two independent axes (cf. [...] While the audio signal will often be processed into a sequence of features, we consider this part of the solution, not of the task.

[Figure: The number of labels to be predicted (left), and the type of each label (right).]
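The remark that audio signals are commonly transformed into two-dimensional time-frequency representations can be illustrated with a small sketch. The code below is not taken from the paper: it is a minimal, dependency-free magnitude spectrogram (a windowed short-time Fourier transform), and the function names, frame length, hop size, and test tone are all assumptions chosen for illustration. Practical systems would use an FFT routine and typically a log-mel scaling on top.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each holding frame_len // 2 + 1 magnitudes.

    Rows index time (frames); columns index frequency (DFT bins).
    A naive O(N^2) DFT keeps the sketch self-contained; it is slow by design.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Cut out one overlapping frame and taper it with the window.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(acc))
        frames.append(row)
    return frames


# Hypothetical test signal (an assumption): a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2.0 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = magnitude_spectrogram(signal)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

With these settings each bin spans 125 Hz, so the tone's energy concentrates in a single column of the time-frequency grid, while the row index preserves the chronological order of the frames.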

artificial intelligence, machine learning, neural network


Country:
- Europe (0.28)

Genre:
- Overview (0.86)
- Research Report (0.64)

Industry:
- Media > Music (1.00)
- Leisure & Entertainment (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (1.00)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.92)
