Goto

Collaborating Authors

 Ringeval, Fabien


Can GPT models Follow Human Summarization Guidelines? Evaluating ChatGPT and GPT-4 for Dialogue Summarization

arXiv.org Artificial Intelligence

This study explores the capabilities of prompt-driven Large Language Models (LLMs) such as ChatGPT and GPT-4 in adhering to human guidelines for dialogue summarization. Experiments employed DialogSum (English social conversations) and DECODA (French call-center interactions), testing various prompts, including prompts from the existing literature, prompts derived from human summarization guidelines, and a two-step prompting approach. Our findings indicate that GPT models often produce lengthy summaries and deviate from human summarization guidelines. However, using human guidelines as an intermediate step shows promise, outperforming direct word-length constraint prompts in some cases. The results reveal that GPT models exhibit distinctive stylistic tendencies in their summaries. While BERTScore did not decrease dramatically for GPT outputs, suggesting semantic similarity to human references and to the outputs of specialised pre-trained models, ROUGE scores reveal grammatical and lexical disparities between GPT-generated and human-written summaries. These findings shed light on the capabilities and limitations of GPT models in following human instructions for dialogue summarization.
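
As an illustration of the evaluation described above, the short sketch below scores a GPT-style summary against a human reference with ROUGE and BERTScore. The `rouge_score` and `bert_score` packages and the example strings are assumptions chosen for illustration, not necessarily the tooling used in the study.

```python
# Hedged sketch: scoring a model summary against a human reference.
# The rouge_score and bert_score packages are assumed for illustration;
# the paper does not specify its exact tooling.
from rouge_score import rouge_scorer
from bert_score import score as bert_score

reference = "Person1 asks Person2 about weekend plans; they agree to meet on Saturday."
candidate = "The two speakers discuss weekend plans and decide to meet on Saturday afternoon."

# Lexical overlap (ROUGE-1/2/L), which the abstract reports as diverging for GPT outputs.
rouge = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
print({k: round(v.fmeasure, 3) for k, v in rouge.score(reference, candidate).items()})

# Semantic similarity (BERTScore), which the abstract reports as remaining high.
P, R, F1 = bert_score([candidate], [reference], lang="en", verbose=False)
print("BERTScore F1:", round(F1.mean().item(), 3))
```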


LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

arXiv.org Artificial Intelligence

Self-supervised learning (SSL) is at the origin of unprecedented improvements in many domains, including computer vision and natural language processing. Speech processing has benefitted drastically from SSL, as most current tasks in the domain are now approached with pre-trained models. This work introduces LeBenchmark 2.0, an open-source framework for assessing and building SSL-equipped French speech technologies. It includes documented, large-scale and heterogeneous corpora with up to 14,000 hours of speech, ten pre-trained SSL wav2vec 2.0 models ranging from 26 million to one billion learnable parameters shared with the community, and an evaluation protocol of six downstream tasks that complements existing benchmarks. LeBenchmark 2.0 also presents unique perspectives on pre-trained SSL models for speech, with an investigation of frozen versus fine-tuned downstream models, task-agnostic versus task-specific pre-trained models, and a discussion of the carbon footprint of large-scale model training.
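
To make the frozen versus fine-tuned distinction concrete, the hedged sketch below toggles gradient flow through an SSL encoder before attaching a downstream head. The `transformers` classes and the checkpoint identifier are assumptions about how such models could be loaded, not the framework's own code.

```python
# Hedged sketch: frozen vs. fine-tuned use of an SSL speech encoder.
# The transformers classes and the checkpoint name are assumptions for
# illustration; they are not taken from the LeBenchmark 2.0 codebase.
import torch.nn as nn
from transformers import Wav2Vec2Model

def build_downstream(checkpoint: str, num_classes: int, freeze: bool) -> nn.ModuleDict:
    encoder = Wav2Vec2Model.from_pretrained(checkpoint)
    if freeze:
        for p in encoder.parameters():
            p.requires_grad = False  # frozen: the SSL encoder is a fixed feature extractor
    head = nn.Linear(encoder.config.hidden_size, num_classes)  # task-specific head
    return nn.ModuleDict({"encoder": encoder, "head": head})

# Hypothetical usage with an assumed French wav2vec 2.0 checkpoint identifier:
# downstream = build_downstream("LeBenchmark/wav2vec2-FR-7K-large", num_classes=4, freeze=True)
```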


Evaluating Emotional Nuances in Dialogue Summarization

arXiv.org Artificial Intelligence

Automatic dialogue summarization is a well-established task that aims to identify the most important content of a human conversation and condense it into a short textual summary. Despite recent progress in the field, we show that most research has focused on summarizing factual information, leaving aside the affective content, which can nevertheless convey useful information for analysing, monitoring, or supporting human interactions. In this paper, we propose and evaluate a set of measures, $PEmo$, to quantify how much emotion is preserved in dialogue summaries. Results show that state-of-the-art summarization models do not preserve the emotional content well in their summaries. We also show that reducing the training set to only emotional dialogues better preserves the emotional content in the generated summaries, while conserving the most salient factual information.
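
To give a concrete sense of what measuring emotion preservation can look like, the sketch below compares the distribution of predicted emotion labels in a dialogue and in its summary. This is a simplified illustration in the spirit of the task, not the paper's definition of $PEmo$; the toy classifier and example texts are hypothetical.

```python
# Hedged sketch of an emotion-preservation measure in the spirit of PEmo.
# This is NOT the paper's definition: it simply compares the distribution of
# emotion labels predicted for the source dialogue and for its summary.
from collections import Counter
from typing import Callable, List

def emotion_distribution(texts: List[str], classify: Callable[[str], str]) -> Counter:
    """Count predicted emotion labels over a list of texts."""
    return Counter(classify(t) for t in texts)

def preservation_score(dialogue: List[str], summary: List[str],
                       classify: Callable[[str], str]) -> float:
    """Proportion of the dialogue's emotion mass that also appears in the summary."""
    d = emotion_distribution(dialogue, classify)
    s = emotion_distribution(summary, classify)
    total = sum(d.values())
    covered = sum(min(d[label], s[label]) for label in d)
    return covered / total if total else 0.0

# `toy_classify` stands in for any utterance-level emotion classifier (hypothetical).
toy_classify = lambda text: "joy" if "great" in text.lower() else "neutral"
dialogue = ["That is great news!", "Yes, I am relieved.", "See you tomorrow."]
summary = ["The speakers share great news and plan to meet tomorrow."]
print(preservation_score(dialogue, summary, toy_classify))
```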


Dynamic Time-Alignment of Dimensional Annotations of Emotion using Recurrent Neural Networks

arXiv.org Artificial Intelligence

Most automatic emotion recognition systems exploit time-continuous annotations of emotion to provide fine-grained descriptions of spontaneous expressions as observed in real-life interactions. As emotion is rather subjective, its annotation is usually performed by several annotators who each provide a trace for a given dimension, i.e. a time-continuous series describing a dimension such as arousal or valence. However, annotations of the same expression are rarely consistent between annotators, either in time or in value, which adds bias and delay to the trace used to learn predictive models of emotion. We therefore propose a method that dynamically compensates for inconsistencies across annotations and synchronises the traces with the corresponding acoustic features using Recurrent Neural Networks. Experimental evaluations were carried out on several emotion data sets that include Chinese, French, German, and Hungarian participants who interacted remotely, either in noise-free conditions or in-the-wild. The results show that our method can significantly increase inter-annotator agreement, as well as the correlation between traces and audio features, for both arousal and valence. In addition, improvements are obtained in the automatic prediction of these dimensions using simple lightweight models, especially for valence in noise-free conditions and for arousal in recordings captured in-the-wild.
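
The abstract mentions simple lightweight models for predicting the emotional dimensions; the sketch below shows a minimal recurrent regressor of that kind, mapping frame-level acoustic features to a time-continuous trace. It is an illustrative stand-in, not the synchronisation method proposed in the paper, and the feature dimensions are arbitrary.

```python
# Hedged sketch: a lightweight recurrent regressor mapping frame-level acoustic
# features to a time-continuous emotion trace (arousal or valence). This is an
# illustrative stand-in, not the paper's alignment architecture.
import torch
import torch.nn as nn

class TraceRegressor(nn.Module):
    def __init__(self, feat_dim: int = 40, hidden: int = 64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, frames, feat_dim) -> trace: (batch, frames)
        out, _ = self.rnn(features)
        return self.head(out).squeeze(-1)

model = TraceRegressor()
acoustic = torch.randn(2, 500, 40)   # two clips of 500 frames, 40-dim features (placeholder)
predicted_trace = model(acoustic)    # (2, 500) time-continuous predictions
print(predicted_trace.shape)
```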


LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

arXiv.org Artificial Intelligence

Self-Supervised Learning (SSL) using huge amounts of unlabeled data has been successfully explored for image and natural language processing. Recent works have also investigated SSL from speech, and were notably successful in improving performance on downstream tasks such as automatic speech recognition (ASR). While these works suggest it is possible to reduce dependence on labeled data for building efficient speech systems, their evaluation was mostly conducted on ASR, using multiple heterogeneous experimental settings (most of them for English). This calls into question the objective comparison of SSL approaches and the evaluation of their impact on building speech systems. In this paper, we propose LeBenchmark: a reproducible framework for assessing SSL from speech. It includes not only ASR (high- and low-resource) tasks but also spoken language understanding, speech translation, and emotion recognition. We also focus on speech technologies in a language other than English: French. SSL models of different sizes are trained from carefully sourced and documented datasets. Experiments show that SSL is beneficial for most but not all tasks, which confirms the need for exhaustive and reliable benchmarks to evaluate its real impact. LeBenchmark is shared with the scientific community for reproducible research in SSL from speech.
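
As a minimal illustration of plugging a pre-trained SSL model into a downstream task, the sketch below extracts frame-level representations from a wav2vec 2.0 encoder. The `transformers` classes and the checkpoint identifier are assumptions for illustration, not the benchmark's own tooling.

```python
# Hedged sketch: extracting frozen wav2vec 2.0 representations for a downstream
# probe. The checkpoint identifier is an assumption; the released French models
# may be published under different names.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

checkpoint = "LeBenchmark/wav2vec2-FR-7K-large"  # assumed identifier
extractor = Wav2Vec2FeatureExtractor.from_pretrained(checkpoint)
encoder = Wav2Vec2Model.from_pretrained(checkpoint).eval()

waveform = torch.randn(16000)  # placeholder for one second of 16 kHz speech
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

with torch.no_grad():  # frozen setting: no gradients through the SSL encoder
    features = encoder(**inputs).last_hidden_state  # (1, frames, hidden_size)
print(features.shape)
```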


AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition

arXiv.org Machine Learning

The Audio/Visual Emotion Challenge and Workshop (AVEC 2019), "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition", is the ninth competition event aimed at comparing multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the health and emotion recognition communities, as well as the audiovisual processing community, to compare the relative merits of various approaches to health and emotion recognition from real-life data. This paper presents the major novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline systems on the three proposed tasks: state-of-mind recognition, depression assessment with AI, and cross-cultural affect sensing.
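
For reference, a metric commonly used to score time-continuous affect predictions in AVEC-style evaluations is the Concordance Correlation Coefficient (CCC); the sketch below computes it for two synthetic traces. This is a generic illustration, not the challenge's official scoring script.

```python
# Hedged sketch: Lin's Concordance Correlation Coefficient (CCC), a metric
# commonly used for time-continuous affect prediction; shown as a general
# illustration, not the official AVEC baseline or scoring code.
import numpy as np

def ccc(gold: np.ndarray, pred: np.ndarray) -> float:
    """Concordance correlation between two 1-D traces."""
    gold_mean, pred_mean = gold.mean(), pred.mean()
    covariance = np.mean((gold - gold_mean) * (pred - pred_mean))
    return 2 * covariance / (gold.var() + pred.var() + (gold_mean - pred_mean) ** 2)

rng = np.random.default_rng(0)
gold = rng.standard_normal(1000)                      # synthetic gold-standard trace
pred = 0.8 * gold + 0.2 * rng.standard_normal(1000)   # a correlated prediction
print(round(ccc(gold, pred), 3))
```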


SEWA DB: A Rich Database for Audio-Visual Emotion and Sentiment Research in the Wild

arXiv.org Artificial Intelligence

Natural human-computer interaction and audio-visual human behaviour sensing systems that achieve robust performance in-the-wild are needed more than ever, as digital devices are becoming an indispensable part of our lives. Accurately annotated real-world data are the crux of devising such systems. However, existing databases usually consider controlled settings, low demographic variability, and a single task. In this paper, we introduce the SEWA database of more than 2000 minutes of audio-visual data of 398 people from six cultures, 50% female, uniformly spanning the age range of 18 to 65 years. Subjects were recorded in two different contexts: while watching adverts and while discussing the adverts in a video chat. The database includes rich annotations of the recordings in terms of facial landmarks, facial action units (FAU), various vocalisations, mirroring, continuously valued valence, arousal, liking, and agreement, and prototypic examples of (dis)liking. The database aims to be an extremely valuable resource for researchers in affective computing and automatic human sensing, and is expected to push forward research in human behaviour analysis, including cultural studies. Along with the database, we provide extensive baseline experiments for automatic FAU detection and automatic valence, arousal, and (dis)liking intensity estimation.