Goto

Collaborating Authors

 early reflection


How much to Dereverberate? Low-Latency Single-Channel Speech Enhancement in Distant Microphone Scenarios

arXiv.org Artificial Intelligence

Dereverberation is an important sub-task of Speech Enhancement (SE) to improve the signal's intelligibility and quality. However, it remains challenging because the reverberation is highly correlated with the signal. Furthermore, the single-channel SE literature has predominantly focused on rooms with short reverb times (typically under 1 second), smaller rooms (under volumes of 1000 cubic meters) and relatively short distances (up to 2 meters). In this paper, we explore real-time low-latency single-channel SE under distant microphone scenarios, such as 5 to 10 meters, and focus on conference rooms and theatres, with larger room dimensions and reverberation times. Such a setup is useful for applications such as lecture demonstrations, drama, and to enhance stage acoustics. First, we show that single-channel SE in such challenging scenarios is feasible. Second, we investigate the relationship between room volume and reverberation time, and demonstrate its importance when randomly simulating room impulse responses. Lastly, we show that for dereverberation with short decay times, preserving early reflections before decaying the transfer function of the room improves overall signal quality.


A neural network-supported two-stage algorithm for lightweight dereverberation on hearing devices

arXiv.org Artificial Intelligence

A two-stage lightweight online dereverberation algorithm for hearing devices is presented in this paper. The approach combines a multi-channel multi-frame linear filter with a single-channel single-frame post-filter. Both components rely on power spectral density (PSD) estimates provided by deep neural networks (DNNs). By deriving new metrics analyzing the dereverberation performance in various time ranges, we confirm that directly optimizing for a criterion at the output of the multi-channel linear filtering stage results in a more efficient dereverberation as compared to placing the criterion at the output of the DNN to optimize the PSD estimation. More concretely, we show that training this stage end-to-end helps further remove the reverberation in the range accessible to the filter, thus increasing the early-to-moderate reverberation ratio. We argue and demonstrate that it can then be well combined with a post-filtering stage to efficiently suppress the residual late reverberation, thereby increasing the early-to-final reverberation ratio. This proposed two-stage procedure is shown to be both very effective in terms of dereverberation performance and computational demands, as compared to, e.g., recent state-of-the-art DNN approaches. Furthermore, the proposed two-stage system can be adapted to the needs of different types of hearing-device users by controlling the amount of reduction of early reflections.


Neural Fourier Shift for Binaural Speech Rendering

arXiv.org Artificial Intelligence

We present a neural network for rendering binaural speech from given monaural audio, position, and orientation of the source. Most of the previous works have focused on synthesizing binaural speeches by conditioning the positions and orientations in the feature space of convolutional neural networks. These synthesis approaches are powerful in estimating the target binaural speeches even for in-the-wild data but are difficult to generalize for rendering the audio from out-of-distribution domains. To alleviate this, we propose Neural Fourier Shift (NFS), a novel network architecture that enables binaural speech rendering in the Fourier space. Specifically, utilizing a geometric time delay based on the distance between the source and the receiver, NFS is trained to predict the delays and scales of various early reflections. NFS is efficient in both memory and computational cost, is interpretable, and operates independently of the source domain by its design. Experimental results show that NFS performs comparable to the previous studies on the benchmark dataset, even with its 25 times lighter memory and 6 times fewer calculations.


Five early reflections on the EU's proposed legal framework for AI

#artificialintelligence

As the use of AI accelerates around the world, policymakers are asking questions about what frameworks should guide the design and use of AI, and how it can benefit society. The EU is the first institution to take a major step to answer these questions through a proposed legal framework for AI released on 21 April 2021. In doing so, the EU is seeking to establish a safe environment for AI innovation and to position itself as a leader in setting "the global gold standard" for regulating AI. This is a positive aspect of the proposal as AI is a broad set of technology, tools and applications. Shifting the focus away from AI technology, which can have significantly different impacts depending on the application for which it is used, helps to mitigate the risk of divergent requirements for AI products and services.