 emotion recognition system


Teaching AI to Feel: A Collaborative, Full-Body Exploration of Emotive Communication

Tütüncü, Esen K., Lemus, Lissette, Pilcher, Kris, Sprengel, Holger, Sabater-Mir, Jordi

arXiv.org Artificial Intelligence

Commonaiverse is an interactive installation exploring human emotions through full-body motion tracking and real-time AI feedback. Participants engage in three phases: Teaching, Exploration and the Cosmos Phase, collaboratively expressing and interpreting emotions with the system. The installation integrates MoveNet for precise motion tracking and a multi-recommender AI system to analyze emotional states dynamically, responding with adaptive audiovisual outputs. By shifting from top-down emotion classification to participant-driven, culturally diverse definitions, we highlight new pathways for inclusive, ethical affective computing. We discuss how this collaborative, out-of-the-box approach pushes multimedia research beyond single-user facial analysis toward a more embodied, co-created paradigm of emotional AI. Furthermore, we reflect on how this reimagined framework fosters user agency, reduces bias, and opens avenues for advanced interactive applications.


Consumer-friendly EEG-based Emotion Recognition System: A Multi-scale Convolutional Neural Network Approach

Ly, Tri Duc, Ngo, Gia H.

arXiv.org Artificial Intelligence

EEG is a non-invasive, safe, and low-risk method to record electrophysiological signals inside the brain. Especially with recent technology developments like dry electrodes, consumer-grade EEG devices, and rapid advances in machine learning, EEG is commonly used as a resource for automatic emotion recognition. With the aim to develop a deep learning model that can perform EEG-based emotion recognition in a real-life context, we propose a novel approach to utilize multi-scale convolutional neural networks to accomplish such tasks. By implementing feature extraction kernels with many ratio coefficients as well as a new type of kernel that learns key information from four separate areas of the brain, our model consistently outperforms the state-of-the-art TSception model in predicting valence, arousal, and dominance scores across many performance evaluation metrics.


Multi-face emotion detection for effective Human-Robot Interaction

Yahyaoui, Mohamed Ala, Oujabour, Mouaad, Letaifa, Leila Ben, Bohi, Amine

arXiv.org Artificial Intelligence

The integration of dialogue interfaces in mobile devices has become ubiquitous, providing a wide array of services. As technology progresses, humanoid robots designed with human-like features to interact effectively with people are gaining prominence, and the use of advanced human-robot dialogue interfaces is continually expanding. In this context, emotion recognition plays a crucial role in enhancing human-robot interaction by enabling robots to understand human intentions. This research proposes a facial emotion detection interface integrated into a mobile humanoid robot, capable of displaying real-time emotions from multiple individuals on a user interface. To this end, various deep neural network models for facial expression recognition were developed and evaluated under consistent computer-based conditions, yielding promising results. Afterwards, a trade-off between accuracy and memory footprint was carefully considered to effectively implement this application on a mobile humanoid robot.


Complex Emotion Recognition System using basic emotions via Facial Expression, EEG, and ECG Signals: a review

Joloudari, Javad Hassannataj, Maftoun, Mohammad, Nakisa, Bahareh, Alizadehsani, Roohallah, Yadollahzadeh-Tabari, Meisam

arXiv.org Artificial Intelligence

The Complex Emotion Recognition System (CERS) deciphers complex emotional states by examining combinations of the basic emotions expressed, their interconnections, and their dynamic variations. Using advanced algorithms, CERS provides insight into emotional dynamics, enabling a nuanced understanding and customized responses. Achieving this level of emotion recognition in machines requires knowledge distillation and the comprehension of novel concepts in a manner akin to human cognition. The development of AI systems for discerning complex emotions poses a substantial challenge with significant implications for affective computing. Furthermore, obtaining a sizable dataset for CERS is difficult because of the intricacies involved in capturing subtle emotions, which call for specialized methods of data collection and processing. Incorporating physiological signals such as Electrocardiogram (ECG) and Electroencephalogram (EEG) can notably enhance CERS by providing valuable insights into the user's emotional state, improving the quality of datasets, and strengthening system dependability. A comprehensive literature review was conducted in this study to assess the efficacy of machine learning, deep learning, and meta-learning approaches in both basic and complex emotion recognition using EEG signals, ECG signals, and facial expression datasets. The chosen research papers offer perspectives on potential applications, clinical implications, and results of CERSs, with the objective of promoting their acceptance and integration into clinical decision-making processes. This study highlights research gaps and challenges in understanding CERSs, encouraging further investigation by relevant studies and organizations. Lastly, the significance of meta-learning approaches in improving CERS performance and guiding future research endeavors is underscored.


What Does it Take to Generalize SER Model Across Datasets? A Comprehensive Benchmark

Ibrahim, Adham, Shehata, Shady, Kulkarni, Ajinkya, Mohamed, Mukhtar, Abdul-Mageed, Muhammad

arXiv.org Artificial Intelligence

Speech emotion recognition (SER) is essential for enhancing human-computer interaction in speech-based applications. Despite improvements on specific emotional datasets, there is still a research gap in SER's capability to generalize across real-world situations. In this paper, we investigate approaches to generalize the SER system across different emotion datasets. In particular, we incorporate 11 emotional speech datasets and present a comprehensive benchmark on the SER task. We also address the challenge of imbalanced data distribution using over-sampling methods when combining SER datasets for training. Furthermore, we explore various evaluation protocols to assess how well SER generalizes. Building on this, we explore the potential of Whisper for SER, emphasizing the importance of thorough evaluation. Our approach is designed to advance SER technology by integrating speaker-independent methods.
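The over-sampling step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which over-sampling method the authors use, so this example assumes plain random over-sampling with replacement; the function name `random_oversample` and the toy file names and labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class items at random until every class
    has as many items as the largest class (assumed strategy)."""
    rng = random.Random(seed)

    # Group samples by their emotion label.
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_samples, out_labels = [], []
    for label, items in by_label.items():
        # Draw the missing items with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy example: utterances pooled from several datasets (hypothetical names).
utterances = ["a.wav", "b.wav", "c.wav", "d.wav", "e.wav", "f.wav"]
emotions = ["neutral", "neutral", "neutral", "neutral", "angry", "sad"]

balanced_x, balanced_y = random_oversample(utterances, emotions)
counts = Counter(balanced_y)
print(counts)
```

After balancing, each emotion class in the toy example has the same number of items as the largest class. Only the training split should be over-sampled, so that duplicated utterances do not leak into evaluation.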


Persian Speech Emotion Recognition by Fine-Tuning Transformers

Shayaninasab, Minoo, Babaali, Bagher

arXiv.org Artificial Intelligence

Given the significance of speech emotion recognition, numerous methods have been developed in recent years to create effective and efficient systems in this domain. One of these methods involves the use of pretrained transformers, fine-tuned to address this specific problem, resulting in high accuracy. Despite extensive discussions and global-scale efforts to enhance these systems, the application of this innovative and effective approach has received less attention in the context of Persian speech emotion recognition. In this article, we review the field of speech emotion recognition and its background, with an emphasis on the importance of employing transformers in this context. We present two models, one based on spectrograms and the other on the audio itself, fine-tuned using the shEMO dataset. These models significantly enhance the accuracy of previous systems, increasing it from approximately 65% to 80% on the mentioned dataset. Subsequently, to investigate the effect of multilinguality on the fine-tuning process, these same models are fine-tuned twice. First, they are fine-tuned using the English IEMOCAP dataset, and then they are fine-tuned with the Persian shEMO dataset. This results in an improved accuracy of 82% for the Persian emotion recognition system. Keywords: Persian Speech Emotion Recognition, shEMO, Self-Supervised Learning


Synthesizing Affective Neurophysiological Signals Using Generative Models: A Review Paper

Nia, Alireza F., Tang, Vanessa, Talou, Gonzalo Maso, Billinghurst, Mark

arXiv.org Artificial Intelligence

The integration of emotional intelligence in machines is an important step in advancing human-computer interaction. This demands the development of reliable end-to-end emotion recognition systems. However, the scarcity of public affective datasets presents a challenge. In this literature review, we emphasize the use of generative models to address this issue in neurophysiological signals, particularly Electroencephalogram (EEG) and Functional Near-Infrared Spectroscopy (fNIRS). We provide a comprehensive analysis of different generative models used in the field, examining their input formulation, deployment strategies, and methodologies for evaluating the quality of synthesized data. This review serves as a comprehensive overview, offering insights into the advantages, challenges, and promising future directions in the application of generative models in emotion recognition systems. Through this review, we aim to facilitate the progression of neurophysiological data augmentation, thereby supporting the development of more efficient and reliable emotion recognition systems.


Progress in Emotion Recognition, Part 1 (Computer Vision)

#artificialintelligence

Abstract: Couples generally manage chronic diseases together, and the management takes an emotional toll on both patients and their romantic partners. Consequently, recognizing the emotions of each partner in daily life could provide insight into their emotional well-being in chronic disease management. The emotions of partners are currently inferred in the lab and in daily life using self-reports, which are not practical for continuous emotion assessment, or observer reports, which are manual, time-intensive, and costly. Currently, there exists no comprehensive overview of works on emotion recognition among couples. Furthermore, approaches for emotion recognition among couples have (1) focused on English-speaking couples in the U.S., (2) used data collected in the lab, and (3) performed recognition using observer ratings rather than partners' self-reported (subjective) emotions.


Artificial emotional intelligence: a safer, smarter future with 5G and emotion recognition

#artificialintelligence

With the advent of 5G communication technology and its integration with AI, we are looking at the dawn of a new era in which people, machines, objects, and devices are connected like never before. This smart era will be characterized by smart facilities and services such as self-driving cars, smart UAVs, and intelligent healthcare. This will be the aftermath of a technological revolution. But the flip side of such a technological revolution is that AI itself can be used to attack or threaten the security of 5G-enabled systems, which, in turn, can greatly compromise their reliability. It is, therefore, imperative to investigate such potential security threats and explore countermeasures before a smart world is realized.


An online game to raise awareness on AI Emotion Recognition.

#artificialintelligence

Identifying what someone is feeling, or even anticipating potential reactions based on nonverbal behavioral cues, is no longer a problem reserved for sensitive and astute people. With the advancement of cutting-edge technologies in emotional intelligence, this capability gains new dimensions as machines learn to recognize human emotions for a variety of purposes. Complex facial detection algorithms are now powerful enough to analyze and measure emotions captured in real-world situations. They are so powerful that ethical concerns have begun to be raised. Emotion recognition is based on facial expression recognition, a computer-based technology that employs algorithms to detect faces, code facial expressions, and recognize emotional states in real time.