Speaker and Language Change Detection using Wav2vec2 and Whisper

Berns, Tijn, Vaessen, Nik, van Leeuwen, David A.

Feb-18-2023–arXiv.org Artificial Intelligence

We investigate recent transformer networks pre-trained for automatic speech recognition for their ability to detect speaker and language changes in speech. We do this by simply adding speaker (change) or language targets to the labels. For Wav2vec2 pre-trained networks, we also investigate if the representation for the speaker change symbol can be conditioned to ...

A penalty was needed to compensate for the difference in the number of parameters, but tuning the weight of this penalty was considered a weakness, which [3] cleverly circumvented by fixing the number of model parameters when going from a single model to two models. In the neural era, [4] applied an LSTM for the sole task of SCD, labelling individual frames with a speaker change boolean, after convolving the single speaker change labels with a unit block function to account for class imbalance.
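The frame-labelling step attributed to [4], in which isolated speaker change labels are convolved with a unit block function, can be illustrated with a minimal sketch. The function name `widen_change_labels`, the block width, the odd-width restriction, and the clipping of the result back to 0/1 are assumptions made for illustration; the excerpt does not give the exact procedure used in [4].

```python
def widen_change_labels(num_frames, change_frames, block_width):
    """Spread each single-frame change label over a block of frames.

    Hypothetical helper: convolves an impulse train (1 at each change
    frame) with a unit block of `block_width` frames, then clips to 0/1.
    """
    if block_width % 2 == 0:
        raise ValueError("block_width must be odd so the block is centred")

    # Impulse train: exactly one positive frame per speaker change.
    impulses = [0] * num_frames
    for t in change_frames:
        impulses[t] = 1

    half = block_width // 2
    widened = []
    for t in range(num_frames):
        # Box convolution: sum the impulses under a window centred on frame t,
        # truncating the window at the start and end of the utterance.
        lo, hi = max(0, t - half), min(num_frames, t + half + 1)
        widened.append(1 if sum(impulses[lo:hi]) > 0 else 0)
    return widened


labels = widen_change_labels(num_frames=12, change_frames=[3, 9], block_width=3)
print(labels)  # [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
```

With a width of 3, each change point yields three positive frames instead of one, which raises the share of positive labels from 2 of 12 frames to 6 of 12 in this toy example. That is the sense in which widening counteracts class imbalance.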



Country:
- Europe
  - Netherlands > Gelderland
    - Nijmegen (0.04)
  - Hungary > Budapest
    - Budapest (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
