AITopics | Doukhan, David

Doukhan, David

A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification

Uro, Rémi, Doukhan, David, Rilliard, Albert, Larcher, Laëtitia, Adgharouamane, Anissa-Claire, Tahon, Marie, Laurent, Antoine

arXiv.org Artificial Intelligence, Apr-26-2024

This paper presents a semi-automatic approach to create a diachronic corpus of voices balanced for speaker's age, gender, and recording period, according to 32 categories (2 genders, 4 age ranges and 4 recording periods). Corpora were selected at the French National Institute of Audiovisual (INA) to obtain at least 30 speakers per category (a total of 960 speakers; only 874 have been found so far). For each speaker, speech excerpts were extracted from audiovisual documents using an automatic pipeline consisting of speech detection, background music and overlapped speech removal, and speaker diarization, used to present clean speaker segments to human annotators identifying target speakers. This pipeline proved highly effective, cutting down manual processing by a factor of ten. An evaluation of the quality of the automatic processing and of the final output is provided. It shows that the automatic processing is comparable to up-to-date processes, and that the output provides high-quality speech for most of the selected excerpts. This method shows promise for creating large corpora of known target speakers.
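The corpus arithmetic in this abstract can be checked with a short sketch. The category counts, the 30-speaker target, and the 874 speakers found come from the abstract; the gender, age-range, and period labels below are placeholders, since the abstract gives only how many of each there are.

```python
from itertools import product

# Placeholder labels: the abstract states only the number of levels per factor.
GENDERS = ["gender_1", "gender_2"]
AGE_RANGES = [f"age_range_{i}" for i in range(1, 5)]
PERIODS = [f"period_{i}" for i in range(1, 5)]

SPEAKERS_PER_CATEGORY = 30  # minimum target per category, from the abstract
SPEAKERS_FOUND = 874        # speakers identified so far, from the abstract

# One category per (gender, age range, recording period) combination.
categories = list(product(GENDERS, AGE_RANGES, PERIODS))
target_total = len(categories) * SPEAKERS_PER_CATEGORY
missing = target_total - SPEAKERS_FOUND
coverage = SPEAKERS_FOUND / target_total

print(f"{len(categories)} categories, target {target_total} speakers")
print(f"{missing} speakers still missing, coverage {coverage:.1%}")
```

Under these figures the corpus is 86 speakers short of the 960-speaker target. The abstract does not say how the shortfall is distributed across categories.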

artificial intelligence, machine learning, speech recognition, (19 more...)

2404.17552

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (1.00)
Media > Radio (0.46)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)

Evolution of Voices in French Audiovisual Media Across Genders and Age in a Diachronic Perspective

Rilliard, Albert, Doukhan, David, Uro, Rémi, Devauchelle, Simon

arXiv.org Artificial Intelligence, Apr-24-2024

We present a diachronic acoustic analysis of the voice of 1023 speakers from French media archives. The speakers are spread across 32 categories based on four periods (years 1955/56, 1975/76, 1995/96, 2015/16), four age groups (20-35, 36-50, 51-65, >65), and two genders. The fundamental frequency ($F_0$) and the first four formants ($F_1$ to $F_4$) were estimated. Procedures used to ensure the quality of these estimations on heterogeneous data are described. From each speaker's $F_0$ distribution, the base-$F_0$ value was calculated to estimate the register. Average vocal tract length was estimated from formant frequencies. Base-$F_0$ and vocal tract length were fit by linear mixed models to evaluate how they may have changed across time periods and genders, corrected for age effects. Results show an effect of the period with a tendency to lower voices, independently of gender. A lowering of pitch is observed with age for female but not male speakers.
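The abstract does not say which estimator was used to derive vocal tract length from formants. As an illustration only, the sketch below uses the textbook uniform-tube approximation (a tube closed at the glottis and open at the lips), in which the n-th resonance is F_n = (2n - 1) c / (4 L). The function name, the speed-of-sound value, and the choice to average the per-formant estimates are all assumptions, not the authors' procedure.

```python
from statistics import mean
from typing import Sequence

# Assumed speed of sound in the vocal tract, in cm/s (illustrative value).
SPEED_OF_SOUND_CM_S = 35_000.0


def vocal_tract_length_cm(
    formants_hz: Sequence[float],
    speed_of_sound: float = SPEED_OF_SOUND_CM_S,
) -> float:
    """Estimate vocal tract length from formants F1..Fn (hypothetical helper).

    Uniform-tube model: F_n = (2n - 1) * c / (4 * L), so each formant
    yields L_n = (2n - 1) * c / (4 * F_n). The estimates are averaged.
    """
    if not formants_hz or any(f <= 0 for f in formants_hz):
        raise ValueError("formants must be a non-empty sequence of positive values")
    per_formant = [
        (2 * n - 1) * speed_of_sound / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return mean(per_formant)


# Formants of an idealized 17.5 cm neutral tube under this model.
print(vocal_tract_length_cm([500.0, 1500.0, 2500.0, 3500.0]))
```

Real vowels deviate from the uniform-tube spacing, so estimates of this kind are usually averaged over many frames per speaker before any statistical modelling.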

artificial intelligence, estimation, gender, (15 more...)

2404.16104

Country: South America > Brazil > Rio de Janeiro (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence (1.00)

Voice Passing : a Non-Binary Voice Gender Prediction System for evaluating Transgender voice transition

Doukhan, David, Devauchelle, Simon, Girard-Monneron, Lucile, Ruz, Mía Chávez, Chaddouk, V., Wagner, Isabelle, Rilliard, Albert

arXiv.org Artificial Intelligence, Apr-23-2024

This paper presents software that describes voices using a continuous Voice Femininity Percentage (VFP). This system is intended for transgender speakers during their voice transition and for voice therapists supporting them in this process. A corpus of 41 French cis- and transgender speakers was recorded. A perceptual evaluation allowed 57 participants to estimate the VFP for each voice. Binary gender classification models were trained on external gender-balanced data and used on overlapping windows to obtain average gender prediction estimates, which were calibrated to predict VFP and obtained higher accuracy than $F_0$ or vocal tract length-based models. Training data speaking style and DNN architecture were shown to impact VFP estimation. Accuracy of the models was affected by speakers' age. This highlights the importance of style, age, and the conception of gender as binary or not, to build adequate statistical representations of cultural concepts.
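The windowed averaging step described in this abstract could look roughly like the sketch below. It is a minimal illustration, not the authors' system: the window and hop sizes, the function names, and the `classify` callable (standing in for a trained binary model that returns a probability between 0 and 1) are all hypothetical, and the calibration step that maps the average onto perceived VFP is left out.

```python
from statistics import mean
from typing import Callable, Iterator, Sequence, Tuple


def overlapping_windows(n_frames: int, win: int, hop: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame indices for windows of length `win` every `hop` frames.

    If the input is shorter than one window, a single window starting at 0 is
    yielded, which slices down to the available frames.
    """
    for start in range(0, max(n_frames - win, 0) + 1, hop):
        yield start, start + win


def mean_window_score(
    frames: Sequence[float],
    classify: Callable[[Sequence[float]], float],
    win: int,
    hop: int,
) -> float:
    """Average a per-window probability and express it as a percentage.

    This is the raw, uncalibrated average; it is not the calibrated VFP.
    """
    scores = [classify(frames[s:e]) for s, e in overlapping_windows(len(frames), win, hop)]
    return 100.0 * mean(scores)


# Toy usage: the "classifier" is just the mean of the window's values.
toy_frames = [0.0] * 4 + [1.0] * 4
print(mean_window_score(toy_frames, classify=mean, win=4, hop=2))
```

Averaging over overlapping windows smooths out short stretches where a single window would give an extreme prediction, which is one plausible reason for using it before calibration.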

artificial intelligence, gender, machine learning, (17 more...)

doi: 10.21437/Interspeech.2023-1835

2404.15176

Country:

South America > Brazil > Rio de Janeiro (0.14)
North America > United States > Minnesota (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Towards a Storytelling Humanoid Robot

AAAI Conferences, Nov-5-2010

This paper reports on the ongoing work done in the GVLEX project. The aim of this multidisciplinary project is to design and test a storytelling humanoid robot. Ideally, the robot would be able to process automatically a given tale or short story, and to play it for a children audience. The useful information is obviously multilevel. In this work we are not willing to design complete analysis for each level of interest but rather to design a multilevel analysis able to point out the interesting parts of the tale. Based on the …

artificial intelligence, information, prosody, (14 more...)

2010 AAAI Fall Symposium Series

Country:

North America > United States > Indiana (0.15)
Asia > Japan > Honshū (0.15)

Genre: Research Report (0.75)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.64)