Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation

Oct-26-2022–arXiv.org Artificial Intelligence

Single channel target speaker separation (TSS) aims at extracting a speaker's voice from a mixture of multiple talkers given an enrollment utterance of that speaker. A typical deep learning TSS framework consists of an upstream model that obtains enrollment speaker embeddings and a downstream model that performs the separation conditioned on the embeddings. In this paper, we look into several important but overlooked aspects of the enrollment embeddings, including the suitability of the widely used speaker identification embeddings, the introduction of the log-mel filterbank and self-supervised embeddings, and the embeddings' cross-dataset generalization capability. Our results show that the speaker identification ...

FBANK, as a simple signal processing method, has been ignored as an enrollment option in previous literature. SSL are a class of powerful models that learn problem-agnostic speech features from unlabelled data [12-14], and we hypothesize that such broader information (compared to SID) could benefit TSS enrollment. Note that, unlike [15], which uses SSL as the input mixture features for blind speaker separation, we limit SSL to offline processing of the enrollment utterance, since TSS often requires real-time low-complexity processing for the mixtures [2-5]. Finally, we consider a cross-dataset evaluation to assess the generalization of the enrollment embeddings [16], which is another important but overlooked aspect in previous TSS research.
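To make the log-mel filterbank (FBANK) enrollment option concrete, the following is a minimal sketch of how a fixed-size enrollment vector could be derived from an enrollment utterance. It is not the paper's implementation: the text does not say how frames are pooled, so averaging log-mel frames over time is an assumption, as are the function names (`mel_filterbank`, `power_spectrum`, `fbank_embedding`) and all parameter values (sample rate, frame size, hop, number of mel bands). The sketch uses only the Python standard library with a naive DFT to stay self-contained; a real system would use an FFT from a numerical library.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filters: one row of n_fft // 2 + 1 weights per band."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power of a naive DFT for the non-negative frequency bins."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def fbank_embedding(signal, sample_rate=8000, n_fft=128, hop=64, n_mels=8):
    """Time-averaged log-mel energies of an enrollment utterance.

    Assumption: mean pooling over frames; the source does not specify this.
    All default parameter values are illustrative only.
    """
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / (n_fft - 1))
              for t in range(n_fft)]  # Hann window
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    sums = [0.0] * n_mels
    n_frames = 0
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        spec = power_spectrum(frame)
        for m, row in enumerate(bank):
            energy = sum(w * p for w, p in zip(row, spec))
            sums[m] += math.log(energy + 1e-10)  # small floor avoids log(0)
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one analysis frame")
    return [s / n_frames for s in sums]
```

Calling `fbank_embedding(samples)` on a list of float samples returns one value per mel band. In a TSS pipeline of the kind described above, such a vector would take the place of the upstream speaker embedding that conditions the downstream separation model.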

