Modelling black-box audio effects with time-varying feature modulation
Comunità, Marco, Steinmetz, Christian J., Phan, Huy, Reiss, Joshua D.
–arXiv.org Artificial Intelligence
ABSTRACT — Deep learning approaches for black-box modelling of audio effects have shown promise; however, the majority of existing work focuses on nonlinear effects with behaviour on relatively short time-scales, such as guitar amplifiers and distortion. While recurrent and convolutional architectures can theoretically be extended to capture behaviour at longer time scales, we show that simply scaling the width, depth, or dilation factor of existing architectures does not result in satisfactory performance when modelling audio effects such as fuzz and dynamic range compression. We demonstrate that our approach more accurately captures long-range dependencies for a range of fuzz and compressor implementations across both time and frequency domain metrics.

[Figure 1: State-of-the-art black-box models like GCN-3 [19] (grey) fail to capture the behaviour of effects with large time constants such as fuzz (blue). Our proposed approach GCNTF-3 (orange), which … (caption truncated).]

1. INTRODUCTION
Audio effects are tools employed by audio engineers and musicians, central to shaping the timbre, dynamics, and spatialisation of sound [1]. However, distortion effects such as fuzz can also pose an additional challenge, since they exhibit time-varying behaviour. Fuzz is characterised not only by asymmetrical clipping, which for sinusoidal inputs results in a rectangular wave output, but also by its attack and release time constants, which modulate the behaviour.
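To make the role of attack and release time constants concrete, the sketch below implements a standard one-pole attack/release envelope follower, the kind of state that gives effects such as fuzz and compressors their long-range dependence on signal history. This is an illustrative sketch, not the paper's model; the function name and the time-constant values are hypothetical.

```python
import numpy as np

def envelope_follower(x, sr, attack_ms=5.0, release_ms=50.0):
    """One-pole attack/release envelope follower (illustrative sketch).

    attack_ms/release_ms are hypothetical example values, not taken
    from the paper; they show how the effect's internal state depends
    on signal history far beyond a short receptive field.
    """
    # Per-sample smoothing coefficients derived from the time constants
    att = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    rel = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env = np.zeros_like(x, dtype=np.float64)
    level = 0.0
    for n, sample in enumerate(np.abs(x)):
        # Rise quickly (attack) when the input exceeds the current level,
        # decay slowly (release) when it falls below it
        coeff = att if sample > level else rel
        level = coeff * level + (1.0 - coeff) * sample
        env[n] = level
    return env
```

Because the envelope at any sample depends on the entire preceding signal through this recursion, a purely feed-forward model with a fixed receptive field struggles to reproduce it, which is consistent with the paper's observation that simply widening or deepening existing architectures is not sufficient.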
May-9-2023