
Collaborating Authors: Vestman, Ville


Extrapolating false alarm rates in automatic speaker verification

arXiv.org Machine Learning

Automatic speaker verification (ASV) vendors and corpus providers would both benefit from tools to reliably extrapolate performance metrics for large speaker populations without collecting new speakers. We address false alarm rate extrapolation under a worst-case model whereby an adversary identifies the closest impostor for a given target speaker from a large population. In this study we improve upon the generative model presented in [3]: despite demonstrating expected overall trends, the predicted false alarm rates were substantially overestimated, particularly at high ASV thresholds (proxies of high-security applications). To tackle this shortcoming, we propose a discriminative ...
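
The worst-case model described above has a simple Monte Carlo reading: the adversary scores every impostor in a candidate population against each target and submits only the closest one, so the statistic of interest is the maximum impostor score per target as the population grows. A minimal sketch of that idea, assuming cosine scoring over synthetic random embeddings (the paper itself works with real ASV scores, not this toy setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for speaker embeddings (e.g., i-/x-vectors);
# real experiments would use scores from an actual ASV system.
dim, n_targets = 64, 50
targets = rng.normal(size=(n_targets, dim))
targets /= np.linalg.norm(targets, axis=1, keepdims=True)

def worst_case_far(threshold, population_size):
    """Fraction of targets whose *closest* impostor, drawn from a random
    population of `population_size` speakers, scores above the threshold."""
    impostors = rng.normal(size=(population_size, dim))
    impostors /= np.linalg.norm(impostors, axis=1, keepdims=True)
    scores = targets @ impostors.T        # cosine score matrix
    closest = scores.max(axis=1)          # best impostor per target
    return float((closest >= threshold).mean())

# The worst-case false alarm rate rises with the population the
# adversary can search -- the quantity the paper seeks to extrapolate.
for n in (100, 1_000, 10_000):
    print(f"population {n:>6}: FAR {worst_case_far(0.4, n):.2f}")
```

Even in this toy setting, the maximum over more impostors drifts upward, which is why naive extrapolation from small corpora underestimates the risk faced at large population sizes.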


Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection

arXiv.org Machine Learning

We consider technology-assisted mimicry attacks in the context of automatic speaker verification (ASV). We use ASV itself to select the target speakers to be attacked by human-based mimicry. We recorded 6 naive mimics, for whom we selected target celebrities from the VoxCeleb1 and VoxCeleb2 corpora (7,365 potential targets) using an i-vector system. Each attacker then attempts to mimic the selected targets, with the utterances subjected to ASV tests using an independently developed x-vector system. Our main finding is negative: even though some of the attackers' scores against the target speakers increased slightly, our mimics did not succeed in spoofing the x-vector system. Interestingly, however, the relative ordering of the selected targets (closest, furthest, median) is consistent between the two systems, which suggests some level of transferability between them.
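
The target-selection step amounts to ranking a large gallery of candidate speakers by ASV score against each mimic's voice and picking the closest, median, and furthest entries. A hedged sketch of that ranking, with unit-normalized random vectors and cosine scoring standing in for the paper's i-vector pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins: 7,365 gallery targets (VoxCeleb-sized) and
# one mimic, each represented by a unit-normalized embedding.
dim, n_gallery = 64, 7365
gallery = rng.normal(size=(n_gallery, dim))
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
mimic = rng.normal(size=dim)
mimic /= np.linalg.norm(mimic)

# Score the mimic against every potential target, then rank.
scores = gallery @ mimic
order = np.argsort(scores)              # ascending: furthest ... closest

selection = {
    "furthest": order[0],
    "median": order[n_gallery // 2],
    "closest": order[-1],
}
for label, idx in selection.items():
    print(f"{label:>8}: target {idx}, score {scores[idx]:.3f}")
```

The paper's transferability observation is that this closest/median/furthest ordering, computed with one system (i-vectors), largely survives re-scoring with an independently built x-vector system.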


Supervector Compression Strategies to Speed up I-Vector System Development

arXiv.org Machine Learning

Front-end factor analysis (FEFA), an extension of probabilistic principal component analysis (PPCA) tailored for use with Gaussian mixture models (GMMs), is currently the prevalent approach to extracting compact utterance-level features (i-vectors) for automatic speaker verification (ASV) systems. Little research has compared FEFA to conventional PPCA applied to maximum a posteriori (MAP) adapted GMM supervectors. We study several alternative methods for compressing MAP-adapted GMM supervectors, including PPCA, factor analysis (FA), and two supervised approaches: supervised PPCA (SPPCA) and the recently proposed probabilistic partial least squares (PPLS). The resulting i-vectors are used in ASV tasks with a probabilistic linear discriminant analysis (PLDA) back-end. We experiment on two datasets: the telephone condition of NIST SRE 2010 and the recent VoxCeleb corpus, collected from YouTube videos of celebrity interviews recorded in diverse acoustic and technical conditions. The results suggest that, in terms of ASV accuracy, the supervector compression approaches are on a par with FEFA, while the supervised approaches did not improve performance. Compared to FEFA, we obtained more than hundred-fold (100x) speedups in total variability model (TVM) training using the PPCA and FA supervector compression approaches.
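
The pipeline being compared is: stack MAP-adapted GMM means into a high-dimensional supervector, compress it to a low-dimensional i-vector, and score with PLDA. A minimal sketch of the compression step using scikit-learn's PCA as a stand-in for PPCA/FA (the supervectors here are random placeholders with deliberately shrunken toy dimensions; real ones come from a trained GMM-UBM):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

# Placeholder supervectors: e.g., a 512-component GMM over 39-dim
# features would give 512 * 39 = 19,968-dim supervectors; the toy
# sizes below keep the example instant to run.
n_utts, sv_dim, ivec_dim = 500, 2000, 400
supervectors = rng.normal(size=(n_utts, sv_dim))

# Compress with PCA; PPCA and FA differ mainly in how the residual
# noise is modeled, not in the overall shape of this pipeline.
pca = PCA(n_components=ivec_dim, whiten=True)
ivectors = pca.fit_transform(supervectors)   # compact "i-vectors"

print(ivectors.shape)   # (500, 400): one 400-dim vector per utterance
# These vectors would then be passed to a PLDA back-end for scoring.
```

The speedup claim follows from this structure: fitting a linear projection over precomputed supervectors is far cheaper than FEFA's iterative total variability training over frame-level Baum-Welch statistics.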