Centroid-based deep metric learning for speaker recognition

Wang, Jixuan, Wang, Kuan-Chieh, Law, Marc, Rudzicz, Frank, Brudno, Michael

Feb-6-2019, arXiv.org Machine Learning

Then, a PLDA model is trained to measure the similarity of i-vectors. Replacing traditional i-vectors with speaker embedding models based on deep neural networks has led to improvements in speaker verification (SV) [4, 3]. Nonetheless, a PLDA classifier is still needed to compare the similarity of embeddings. More recently, end-to-end training of an embedding network that makes decisions by comparing distances in the embedding space to a cross-validated threshold has outperformed traditional methods. For a detailed comparison between embedding networks and i-vector based methods, we refer the reader to [6, 4, 3]. Building on these studies, our work focuses on the comparison between two different approaches to deep metric learning, triplet loss (TL) [5, 6, 7, 8] and prototypical network loss (PNL) [10], for end-to-end speaker embedding models.

Deep metric learning: End-to-end speaker embedding models can be seen as a form of deep metric learning, which has been widely studied in the machine learning literature. Early examples of metric learning with neural networks include signature [11] and face verification [12]. Both compare pairs of examples with standard similarity functions (e.g.
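
To make the contrast between the two losses concrete, the following is a minimal sketch and not the authors' implementation. It assumes squared Euclidean distance, a hinge-style triplet loss with a hypothetical `margin`, and a prototypical-style loss computed as a softmax over negative distances to per-speaker centroids. All function names, the toy embeddings, and the `threshold` value are illustrative assumptions; the excerpt does not specify any of them.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on one (anchor, positive, negative) triplet.
    The margin value is an illustrative assumption."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)

def centroid(vectors):
    """Element-wise mean of a list of embeddings (one speaker's prototype)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def prototypical_loss(query, true_speaker, support):
    """Negative log-softmax over negative squared distances to each centroid.
    `support` maps a speaker id to a list of that speaker's embeddings."""
    logits = {spk: -sq_dist(query, centroid(vecs)) for spk, vecs in support.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return log_z - logits[true_speaker]

def same_speaker(emb_a, emb_b, threshold):
    """Verification decision: accept when the embedding distance is below a threshold."""
    return sq_dist(emb_a, emb_b) < threshold

# Toy 2-D embeddings for two hypothetical speakers.
support = {
    "alice": [[0.0, 0.0], [0.0, 2.0]],
    "bob": [[4.0, 0.0], [4.0, 2.0]],
}
query = [0.0, 1.0]  # lies on alice's centroid

loss_correct = prototypical_loss(query, "alice", support)
loss_wrong = prototypical_loss(query, "bob", support)
tl_easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0])  # negative is far away
tl_hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [1.0, 0.0])  # negative as close as positive
```

The triplet loss looks at one positive and one negative example at a time, whereas the centroid-based loss compares the query against every speaker in the episode at once. In a real system the threshold would be chosen on held-out data, as the text describes, rather than fixed by hand.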
