AITopics | vocal burst

The NeurIPS 2023 Machine Learning for Audio Workshop: Affective Audio Benchmarks and Novel Data

Baird, Alice, Manzelli, Rachel, Tzirakis, Panagiotis, Gagne, Chris, Li, Haoqi, Allen, Sadie, Dieleman, Sander, Kulis, Brian, Narayanan, Shrikanth S., Cowen, Alan

arXiv.org Artificial Intelligence, Mar-20-2024

The NeurIPS 2023 Machine Learning for Audio Workshop brings together machine learning (ML) experts from various audio domains. There are several valuable audio-driven ML tasks, from speech emotion recognition to audio event detection, but the community is sparse compared to other ML areas, e.g., computer vision or natural language processing. A major limitation with audio is the available data; with audio being a time-dependent modality, high-quality data collection is time-consuming and costly, making it challenging for academic groups to apply their often state-of-the-art strategies to a larger, more generalizable dataset. In this short white paper, to encourage researchers with limited access to large datasets, the organizers first outline several open-source datasets that are available to the community, and for the duration of the workshop are making several proprietary datasets available. Namely, three vocal datasets, Hume-Prosody, Hume-VocalBurst, an acted emotional speech dataset Modulate-Sonata, and an in-game streamer dataset Modulate-Stream. We outline the current baselines on these datasets but encourage researchers from across audio to utilize them outside of the initial baseline tasks.

dataset, emotion, vocal burst, (14 more...)

arXiv:2403.14048

Country:

North America > United States > New York (0.05)
South America > Venezuela (0.04)
Africa > South Africa (0.04)
(8 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Leisure & Entertainment (1.00)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

EmoGator: A New Open Source Vocal Burst Dataset with Baseline Machine Learning Classification Methodologies

Buhl, Fred W.

arXiv.org Artificial Intelligence, Apr-6-2023

Vocal Bursts -- short, non-speech vocalizations that convey emotions, such as laughter, cries, sighs, moans, and groans -- are an often-overlooked aspect of speech emotion recognition, but an important aspect of human vocal communication. One barrier to the study of these interesting vocalizations is a lack of large datasets. I am pleased to introduce the EmoGator dataset, which consists of 32,130 samples from 357 speakers, 16.9654 hours of audio; each sample classified into one of 30 distinct emotion categories by the speaker. Several different approaches to construct classifiers to identify emotion categories will be discussed, and directions for future research will be suggested. The dataset is available for download from https://github.com/fredbuhl/EmoGator.

artificial intelligence, category, machine learning, (18 more...)

arXiv:2301.00508

Country:

North America > United States > Florida > Orange County > Orlando (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition

Li, Jinchao, Wu, Xixin, Song, Kaitao, Li, Dongsheng, Liu, Xunying, Meng, Helen

arXiv.org Artificial Intelligence, Mar-14-2023

As a common way of emotion signaling via non-linguistic vocalizations, vocal burst (VB) plays an important role in daily social interaction. Understanding and modeling human vocal bursts are indispensable for developing robust and general artificial intelligence. Exploring computational approaches for understanding vocal bursts is attracting increasing research attention. In this work, we propose a hierarchical framework, based on chain regression models, for affective recognition from VBs, that explicitly considers multiple relationships: (i) between emotional states and diverse cultures; (ii) between low-dimensional (arousal & valence) and high-dimensional (10 emotion classes) emotion spaces; and (iii) between various emotion classes within the high-dimensional space. To address the challenge of data sparsity, we also use self-supervised learning (SSL) representations with layer-wise and temporal aggregation modules. The proposed systems participated in the ACII Affective Vocal Burst (A-VB) Challenge 2022 and ranked first in the "TWO" and "CULTURE" tasks. Experimental results based on the ACII Challenge 2022 dataset demonstrate the superior performance of the proposed system and the effectiveness of considering multiple relationships using hierarchical regression chain models.

artificial intelligence, machine learning, recognition, (16 more...)

arXiv:2303.08027

Country:

South America > Venezuela (0.05)
Africa > South Africa (0.05)
Asia > China > Hong Kong (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.72)

The ACII 2022 Affective Vocal Bursts Workshop & Competition: Understanding a critically understudied modality of emotional expression

Baird, Alice, Tzirakis, Panagiotis, Brooks, Jeffrey A., Gregory, Christopher B., Schuller, Björn, Batliner, Anton, Keltner, Dacher, Cowen, Alan

arXiv.org Artificial Intelligence, Oct-27-2022

The ACII Affective Vocal Bursts Workshop & Competition is focused on understanding multiple affective dimensions of vocal bursts: laughs, gasps, cries, screams, and many other non-linguistic vocalizations central to the expression of emotion and to human communication more generally. This year's competition comprises four tracks using a large-scale and in-the-wild dataset of 59,299 vocalizations from 1,702 speakers. The first, the A-VB-High task, requires competition participants to perform a multi-label regression on a novel model for emotion, utilizing ten classes of richly annotated emotional expression intensities, including Awe, Fear, and Surprise. The second, the A-VB-Two task, utilizes the more conventional two-dimensional model for emotion: arousal and valence. The third, the A-VB-Culture task, requires participants to explore the cultural aspects of the dataset, training native-country-dependent models. Finally, for the fourth task, A-VB-Type, participants should recognize the type of vocal burst (e.g., laughter, cry, grunt) as an 8-class classification. This paper describes the four tracks and baseline systems, which use state-of-the-art machine learning methods. The baseline performance for each track is obtained by utilizing an end-to-end deep learning model and is as follows: for A-VB-High, a mean (over the 10 dimensions) Concordance Correlation Coefficient (CCC) of 0.5687 is obtained; for A-VB-Two, a mean (over the 2 dimensions) CCC of 0.5084 is obtained; for A-VB-Culture, a mean CCC from the four cultures of 0.4401 is obtained; and for A-VB-Type, the baseline Unweighted Average Recall (UAR) from the 8 classes is 0.4172.
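
The two metrics named in this abstract, CCC and UAR, can be sketched in a few lines. This is a minimal illustration using their standard textbook definitions, not the challenge's official scoring code; the function names `ccc`, `mean_ccc`, and `uar` are hypothetical, and the toy inputs are invented for the example.

```python
from statistics import fmean


def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient between two equal-length sequences.

    Standard definition: 2*cov / (var_true + var_pred + (mean_true - mean_pred)^2),
    using population (biased) variance and covariance.
    """
    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def mean_ccc(rows_true, rows_pred):
    """Average CCC over output dimensions; each row is one sample (assumed layout)."""
    n_dims = len(rows_true[0])
    per_dim = [
        ccc([r[d] for r in rows_true], [r[d] for r in rows_pred])
        for d in range(n_dims)
    ]
    return fmean(per_dim)


def uar(labels, preds):
    """Unweighted Average Recall: the mean of per-class recalls.

    Every class counts equally, regardless of how many samples it has.
    """
    recalls = []
    for c in sorted(set(labels)):
        idx = [i for i, label in enumerate(labels) if label == c]
        recalls.append(sum(preds[i] == c for i in idx) / len(idx))
    return fmean(recalls)


if __name__ == "__main__":
    # A constant offset lowers CCC even though the correlation is perfect.
    print(ccc([1, 2, 3], [2, 3, 4]))
    # Always predicting the majority class scores 0.75 accuracy here,
    # but UAR exposes the missed minority class.
    print(uar([0, 0, 0, 1], [0, 0, 0, 0]))
```

Unlike Pearson correlation, CCC penalizes shifts in mean and scale, and unlike accuracy, UAR does not reward a model for ignoring rare classes, which is why both suit imbalanced affective data.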

artificial intelligence, emotion, machine learning, (17 more...)

arXiv:2207.03572

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York (0.05)
South America > Venezuela (0.05)
(9 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts

Anuchitanukul, Atijit, Specia, Lucia

arXiv.org Artificial Intelligence, Oct-18-2022

We present Burst2Vec, our multi-task learning approach to predict emotion, age, and origin (i.e., native country/language) from vocal bursts. Burst2Vec utilises pre-trained speech representations to capture acoustic information from raw waveforms and incorporates the concept of model debiasing via adversarial training. Our models achieve a relative 30% performance gain over baselines using pre-extracted features and score the highest amongst all participants in the ICML ExVo 2022 Multi-Task Challenge.

artificial intelligence, machine learning, representation, (15 more...)

arXiv:2206.12469

Country:

South America > Venezuela (0.05)
Asia > China (0.05)
Africa > South Africa (0.05)
(8 more...)

Genre: Research Report > Experimental Study (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

An Efficient Multitask Learning Architecture for Affective Vocal Burst Analysis

Hallmen, Tobias, Mertes, Silvan, Schiller, Dominik, André, Elisabeth

arXiv.org Artificial Intelligence, Sep-28-2022

Affective speech analysis is an ongoing topic of research. A relatively new problem in this field is the analysis of vocal bursts, which are nonverbal vocalisations such as laughs or sighs. Current state-of-the-art approaches to address affective vocal burst analysis are mostly based on wav2vec2 or HuBERT features. In this paper, we investigate the use of the wav2vec successor data2vec in combination with a multitask learning pipeline to tackle different analysis problems at once. To assess the performance of our efficient multitask learning architecture, we participate in the 2022 ACII Affective Vocal Burst Challenge, showing that our approach substantially outperforms the baseline established there in three different subtasks.

artificial intelligence, machine learning, vocal burst, (14 more...)

arXiv:2209.13914

Country:

Europe > Germany (0.05)
Europe > Portugal (0.04)
Asia > India > Telangana > Hyderabad (0.04)
Africa > Guinea-Bissau (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.81)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (0.61)

Self-Supervised Attention Networks and Uncertainty Loss Weighting for Multi-Task Emotion Recognition on Vocal Bursts

Karas, Vincent, Triantafyllopoulos, Andreas, Song, Meishu, Schuller, Björn W.

arXiv.org Artificial Intelligence, Sep-27-2022

Vocal bursts play an important role in communicating affect, making them valuable for improving speech emotion recognition. Here, we present our approach for classifying vocal bursts and predicting their emotional significance in the ACII Affective Vocal Burst Workshop & Challenge 2022 (A-VB). We use a large self-supervised audio model as shared feature extractor and compare multiple architectures built on classifier chains and attention networks, combined with uncertainty loss weighting strategies. Our approach surpasses the challenge baseline by a wide margin on all four tasks.
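
The abstract does not say which uncertainty loss weighting strategies were compared. As a rough sketch only, one commonly used simplified form gives each task a learnable log-variance `s_i` and combines the task losses as the sum of `exp(-s_i) * L_i + s_i`. The function name `uncertainty_weighted_loss` and the numbers below are assumptions made for illustration, not the authors' implementation.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses using per-task log-variances (assumed formulation).

    Each loss is scaled by exp(-s), so a task with high estimated uncertainty
    contributes less; the additive +s term discourages the trivial solution of
    inflating every s to silence all tasks. In real training the log-variances
    would be learnable parameters updated by the optimizer; here they are
    plain floats.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


if __name__ == "__main__":
    # With all log-variances at zero the result is the plain sum of the losses.
    print(uncertainty_weighted_loss([1.0, 2.0], [0.0, 0.0]))
    # Raising the second task's log-variance halves its weight and adds log(2).
    print(uncertainty_weighted_loss([1.0, 2.0], [0.0, math.log(2)]))
```

The appeal of this scheme in a multi-task setting is that the relative task weights are learned jointly with the model rather than tuned by hand.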

artificial intelligence, arxiv preprint arxiv, machine learning, (12 more...)

arXiv:2209.07384

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Austria > Vienna (0.14)
Europe > Germany (0.05)
(8 more...)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

First ever interactive AUDIO map lets you HEAR emotions

Daily Mail - Science & tech, Feb-6-2019, 13:25:04 GMT

Scientists have found that involuntary sounds we make when we express shock, elation and fear reveal a lot more about what we feel than previously thought. An interactive audio map shows more than 2,000 sounds for a range of 24 different emotions like fear, surprise (positive and negative), embarrassment, elation and ecstasy. The results are demonstrated in vivid sound and colour on the map, which allows you to move the cursor along it and hear the varying sounds. Spontaneous sounds like 'woohoo' to convey excitement and 'argh' to show anger say a lot more about what we're feeling than previously understood, according to new research by the University of California, Berkeley. Scientists conducted a statistical analysis of responses to more than 2,000 nonverbal exclamations known as 'vocal bursts' to discover that there are thousands of different sounds for varying types of emotion.

artificial intelligence, embarrassment, emotion, (16 more...)

Country: Asia (0.16)

Genre: Research Report > New Finding (0.70)

Industry: Health & Medicine > Therapeutic Area (0.31)

Technology: Information Technology > Artificial Intelligence > Robots (0.54)
