
Collaborating Authors

 Gómez-Zaragozá, Lucía


EMOVOME Database: Advancing Emotion Recognition in Speech Beyond Staged Scenarios

arXiv.org Artificial Intelligence

Natural databases for Speech Emotion Recognition (SER) are scarce and often rely on staged scenarios, such as films or television shows, limiting their application in real-world contexts. We developed and publicly released the Emotional Voice Messages (EMOVOME) database, comprising 999 voice messages from real conversations of 100 Spanish speakers on a messaging app, labeled with continuous and discrete emotions by expert and non-expert annotators. We evaluated speaker-independent SER models using a standard set of acoustic features and transformer-based models, compared the results with reference databases of acted and elicited speech, and analyzed the influence of the annotators and gender fairness. The pre-trained UniSpeech-SAT-Large model achieved the best results on EMOVOME: 61.64% and 55.57% Unweighted Accuracy (UA) for 3-class valence and arousal prediction respectively, a 10% improvement over the baseline models, and 42.58% UA for the emotion categories. Models trained on EMOVOME performed below those trained on the acted RAVDESS database. The elicited IEMOCAP database also outperformed EMOVOME in predicting emotion categories, while similar results were obtained for valence and arousal. EMOVOME outcomes varied with the annotator labels, showing better results and fairness when expert and non-expert annotations were combined. This study highlights the gap between staged and real-life scenarios, supporting further advances in recognizing genuine emotions.
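
As a rough illustration of the transformer-based setup described above, the sketch below loads a pre-trained UniSpeech-SAT-Large checkpoint with a 3-class classification head (which would still need fine-tuning on EMOVOME) and scores predictions with Unweighted Accuracy, i.e. the unweighted mean of per-class recalls. The audio file names, label coding and the assumption that the checkpoint ships a feature-extractor config are illustrative; this is not the authors' released pipeline.

```python
# Minimal sketch, not the paper's exact pipeline: 3-class valence prediction
# with a pre-trained UniSpeech-SAT-Large backbone, scored with Unweighted
# Accuracy (mean per-class recall, i.e. balanced accuracy).
import torch
import librosa
from transformers import AutoFeatureExtractor, UniSpeechSatForSequenceClassification
from sklearn.metrics import balanced_accuracy_score

MODEL_ID = "microsoft/unispeech-sat-large"

# Assumes the checkpoint provides a preprocessor config; the classification
# head is randomly initialized and would be fine-tuned on EMOVOME.
extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
model = UniSpeechSatForSequenceClassification.from_pretrained(
    MODEL_ID, num_labels=3  # e.g. negative / neutral / positive valence
)
model.eval()

def predict_valence(wav_path: str) -> int:
    """Return the predicted 3-class valence label for one voice message."""
    audio, _ = librosa.load(wav_path, sr=16_000)  # backbone expects 16 kHz mono
    inputs = extractor(audio, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))

# Placeholder files and ground-truth labels, purely for illustration.
paths = ["msg_001.wav", "msg_002.wav", "msg_003.wav"]
y_true = [0, 1, 2]
y_pred = [predict_valence(p) for p in paths]

# Unweighted Accuracy corresponds to sklearn's balanced accuracy.
print(f"UA: {balanced_accuracy_score(y_true, y_pred):.2%}")
```

A speaker-independent evaluation, as in the study, would additionally require splitting the data so that no speaker appears in both the training and test sets.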


Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses

arXiv.org Artificial Intelligence

Alzheimer's Disease (AD) is the most common neurodegenerative disease worldwide and often results in communication difficulties. Analysing speech can therefore serve as a diagnostic tool for identifying the condition. The recent ADReSS challenge provided a dataset for AD classification and highlighted the utility of manual transcriptions. In this study, we used the state-of-the-art Automatic Speech Recognition (ASR) model Whisper to obtain the transcriptions, which also include automatic punctuation. Combining pretrained FastText word embeddings with recurrent neural networks, the classification models achieved test accuracy scores of 0.854 on manual transcripts and 0.833 on ASR transcripts. Additionally, we explored the influence of including pause information and punctuation in the transcriptions. We found that punctuation yielded only minor improvements in some cases, whereas pause encoding aided AD classification for both manual and ASR transcriptions across all approaches investigated.
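
As a rough illustration of the pause-encoding idea, the sketch below transcribes a recording with Whisper and inserts pause tokens wherever the silence between consecutive segments exceeds a threshold. The file name, thresholds and token names are assumptions for illustration; the paper's exact encoding scheme is not reproduced here.

```python
# Minimal sketch, assuming the openai-whisper package: transcribe one recording
# and mark inter-segment silences with pause tokens.
import whisper

model = whisper.load_model("medium")                  # any Whisper size works
result = model.transcribe("participant_001.wav")      # hypothetical file name

def encode_pauses(segments, short=0.5, long=2.0):
    """Join segment texts, inserting a token when the gap between two
    consecutive segments exceeds the given thresholds (in seconds)."""
    pieces = []
    prev_end = None
    for seg in segments:
        if prev_end is not None:
            gap = seg["start"] - prev_end
            if gap >= long:
                pieces.append("<LONG_PAUSE>")
            elif gap >= short:
                pieces.append("<PAUSE>")
        pieces.append(seg["text"].strip())
        prev_end = seg["end"]
    return " ".join(pieces)

transcript_with_pauses = encode_pauses(result["segments"])
print(transcript_with_pauses)
```

The resulting text, with or without its automatic punctuation, would then be mapped to FastText word embeddings and fed to a recurrent classifier, mirroring the pipeline described in the abstract.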