facial action
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Multimodal Machine Learning Can Predict Videoconference Fluidity and Enjoyment
Andrew Chang, Viswadruth Akkaraju, Ray McFadden Cogliano, David Poeppel, Dustin Freeman
Videoconferencing is now a frequent mode of communication in both professional and informal settings, yet it often lacks the fluidity and enjoyment of in-person conversation. This study leverages multimodal machine learning to predict moments of negative experience in videoconferencing. We sampled thousands of short clips from the RoomReader corpus, extracting audio embeddings, facial actions, and body motion features to train models that identify low conversational fluidity, identify low enjoyment, and classify conversational events (backchanneling, interruption, or gap). Our best models achieved an ROC-AUC of up to 0.87 on hold-out videoconference sessions, with domain-general audio features proving most critical. This work demonstrates that multimodal audio-video signals can effectively predict high-level subjective conversational outcomes. It also contributes to research on videoconferencing user experience by showing that multimodal machine learning can identify rare moments of negative user experience for further study or mitigation.
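As a rough illustration of the pipeline this abstract describes, the Python sketch below concatenates per-clip audio, facial-action, and body-motion features, trains a classifier to flag low-fluidity clips, and scores ROC-AUC on held-out sessions. The feature dimensions, random data, and choice of gradient boosting are assumptions for illustration, not the authors' implementation.

```python
# Multimodal clip classifier sketch: fuse per-clip feature blocks and
# evaluate on sessions never seen during training, mirroring the
# hold-out-session evaluation described in the abstract.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_clips = 500
audio = rng.normal(size=(n_clips, 128))       # stand-in audio embeddings
face = rng.normal(size=(n_clips, 17))         # stand-in action-unit features
motion = rng.normal(size=(n_clips, 8))        # stand-in body-motion features
y = rng.integers(0, 2, size=n_clips)          # 1 = low conversational fluidity
sessions = rng.integers(0, 20, size=n_clips)  # which session each clip is from

X = np.hstack([audio, face, motion])          # simple feature-level fusion

# Split by session so no session contributes to both train and test.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=sessions))

clf = GradientBoostingClassifier().fit(X[train_idx], y[train_idx])
scores = clf.predict_proba(X[test_idx])[:, 1]
print("hold-out ROC-AUC:", roc_auc_score(y[test_idx], scores))
```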
- North America > United States > New York (0.06)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- (2 more...)
Interpretable Video-based Stress Detection with Self-Refine Chain-of-Thought Reasoning
Stress detection is a critical area of research with significant implications for health monitoring and intervention systems. In this paper, we propose a novel interpretable approach for video-based stress detection, leveraging self-refine chain-of-thought reasoning to enhance both accuracy and transparency in decision-making processes. Our method focuses on extracting subtle behavioral and physiological cues from video sequences that indicate stress levels. By incorporating a chain-of-thought reasoning mechanism, the system refines its predictions iteratively, ensuring that the decision-making process can be traced and explained. The model also learns to self-refine through feedback loops, improving its reasoning capabilities over time. We evaluate our approach on several public and private datasets, demonstrating its superior performance in comparison to traditional video-based stress detection methods. Additionally, we provide comprehensive insights into the interpretability of the model's predictions, making the system highly valuable for applications in both healthcare and human-computer interaction domains.
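The iterative refinement the abstract describes can be sketched as a generic self-refine loop: an initial prediction with a rationale is repeatedly critiqued and revised by the same model. Everything below, the `model` callable, the prompts, and the stopping rule, is a hypothetical stand-in; the paper's actual prompts and vision-language model are not specified here.

```python
# Generic self-refine chain-of-thought loop (illustrative only).
def self_refine(video_cues, model, max_rounds=3):
    # Initial stress assessment with an explicit rationale.
    answer = model(f"Assess stress from these cues: {video_cues}. Explain your reasoning.")
    for _ in range(max_rounds):
        # The model critiques its own previous answer.
        feedback = model(f"Critique this assessment: {answer}")
        if "no issues" in feedback.lower():  # illustrative stopping rule
            break
        # Revise the answer using the feedback.
        answer = model(f"Revise: {answer}\nFeedback: {feedback}")
    return answer

# Usage with a trivial stand-in model:
print(self_refine("fidgeting, averted gaze", model=lambda prompt: "no issues: moderate stress"))
```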
Image Representations for Facial Expression Coding
The Facial Action Coding System (FACS) (9) is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These methods include unsupervised learning techniques for finding basis images, such as principal component analysis, independent component analysis, and local feature analysis, and supervised learning techniques such as Fisher's linear discriminants.
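One pipeline this paper compares, unsupervised basis images followed by Fisher's linear discriminants, can be sketched compactly. In the sketch below, PCA plays the basis-finding role and random arrays stand in for aligned face images; the image size, number of components, and class count are illustrative assumptions.

```python
# PCA basis images + Fisher's linear discriminants for action coding.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
images = rng.normal(size=(200, 48 * 48))  # 200 flattened face images (stand-ins)
labels = rng.integers(0, 6, size=200)     # 6 hypothetical facial-action classes

pca = PCA(n_components=30).fit(images)    # unsupervised: find basis images
codes = pca.transform(images)             # project each face onto the basis

lda = LinearDiscriminantAnalysis().fit(codes, labels)  # supervised step
print("training accuracy:", lda.score(codes, labels))
```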
Emotion recognition AI finding fans among lawyers swaying juries and potential clients
The American Bar Association has taken greater notice of emotional AI as a tool for honing courtroom and marketing performance. It is not clear if the storied group has caught up with the controversy that follows the comparatively new field. On the association's May 18 Legal Rebels podcast, ABA Journal legal affairs writer Victor Li speaks with the CEO of software startup EmotionTrac (a subsidiary of mobile ad tech firm Jinglz) about how an app first designed for the advertising industry reportedly has been adopted by dozens of attorneys. Aaron Itzkowitz is at pains to make clear the difference between facial recognition and affect recognition. At the moment, the use of face biometrics by governments is a growing controversy, and Li would like to stay separate from that debate.
AI system can detect nine different emotional states in FARM ANIMALS
An AI-powered computer system has been created which identifies the emotional state of farm animals and whether they are happy or not. It is hoped that a better understanding of how animals are feeling can help improve their living conditions and quality of life. Thousands of images of cows and pigs from six farms around the world were used to train the network, called WUR Wolf, which was accurate 85 per cent of the time. Pictured: one of the images the system was trialled on, showing a pig classified as 'alert and neutral'. Deep learning algorithms were used to identify 13 facial actions, which included differences in an animal's ears, eyes and behaviour.
A Scalable Approach for Facial Action Unit Classifier Training Using Noisy Data for Pre-Training
We present a large set of automatically FACS-annotated images with gender, nationality, and biographical meta-data; propose a simple pipeline of pre-training and fine-tuning a CNN classifier in an end-to-end fashion for detecting the presence of facial action units, producing state-of-the-art performance; and conduct experiments to systematically investigate the effect of (1) the number of pre-training images and (2) the number of pre-training images of different people.
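A minimal sketch of this two-stage recipe, pre-training on a large noisily labelled set and then fine-tuning end-to-end on a small clean set at a lower learning rate, is given below. The tiny CNN, random tensors, and hyperparameters are illustrative assumptions, not the paper's architecture or data.

```python
# Noisy pre-training followed by clean fine-tuning for multi-label
# action-unit detection (toy-scale sketch).
import torch
import torch.nn as nn

n_aus = 12  # detect the presence of 12 action units per image
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, n_aus),
)
loss_fn = nn.BCEWithLogitsLoss()  # one sigmoid output per action unit

def train(images, labels, lr, epochs):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()

# Stage 1: pre-train on a large, automatically (noisily) labelled set.
noisy_x, noisy_y = torch.randn(256, 3, 64, 64), torch.rand(256, n_aus).round()
train(noisy_x, noisy_y, lr=1e-3, epochs=5)

# Stage 2: fine-tune end-to-end on a small, cleanly labelled set.
clean_x, clean_y = torch.randn(32, 3, 64, 64), torch.rand(32, n_aus).round()
train(clean_x, clean_y, lr=1e-4, epochs=5)
```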
Linear Disentangled Representation Learning for Facial Actions
Limited annotated data available for the recognition of facial expressions and action units hampers the training of deep networks, which can learn disentangled invariant features. However, a linear model with just a few parameters is normally not demanding in terms of training data. In this paper, we propose an elegant linear model to untangle confounding factors in challenging realistic multichannel signals such as 2D face videos. The simple yet powerful model does not rely on huge training data and is natural for recognizing facial actions without explicitly disentangling identity. Based on well-understood, intuitive linear models such as Sparse Representation based Classification (SRC), previous attempts require a preprocessing step of explicit decoupling which is practically inexact. Instead, we exploit the low-rank property across frames to subtract the underlying neutral faces, which are modeled jointly with a sparse representation of the action components under a group sparsity constraint. On the extended Cohn-Kanade dataset (CK+), our one-shot automatic method on raw face videos performs as competitively as SRC applied on manually prepared action components, and even better than SRC in terms of true positive rate. We apply the model to the even more challenging task of facial action unit recognition, verified on the MPI Face Video Database (MPI-VDB), achieving decent performance. All the programs and data have been made publicly available.
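The low-rank-plus-sparse idea at the heart of this model can be illustrated with a toy decomposition: video frames stacked as columns are split into a shared low-rank part (the underlying neutral face) and a sparse part (the action components). The alternating shrinkage scheme below is a generic robust-PCA-style stand-in with thresholds tuned to the synthetic data, not the authors' group-sparse formulation.

```python
# Toy low-rank (neutral face) + sparse (facial actions) decomposition.
import numpy as np

def shrink(x, tau):
    # Soft-threshold entries of x toward zero by tau.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def low_rank_plus_sparse(D, tau_l=10.0, tau_s=1.0, n_iter=20):
    # Alternate singular-value thresholding (low-rank update) with
    # elementwise soft-thresholding (sparse update).
    S = np.zeros_like(D)
    for _ in range(n_iter):
        U, sig, Vt = np.linalg.svd(D - S, full_matrices=False)
        L = (U * shrink(sig, tau_l)) @ Vt
        S = shrink(D - L, tau_s)
    return L, S

rng = np.random.default_rng(0)
# Rank-1 "neutral face" repeated over 30 frames, plus sparse "actions".
neutral = np.outer(rng.normal(size=100), np.ones(30))
actions = (rng.random((100, 30)) < 0.05) * rng.normal(scale=3.0, size=(100, 30))

L, S = low_rank_plus_sparse(neutral + actions)
print("rank of recovered neutral part:", np.linalg.matrix_rank(L, tol=1e-3))
```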
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > Illinois (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
A Prototype for Automatic Recognition of Spontaneous Facial Actions
Bartlett, M.S., Littlewort, G.C., Sejnowski, T.J., Movellan, J.R.
Spontaneous facial expressions differ substantially from posed expressions, much as continuous, spontaneous speech differs from isolated words produced on command. Previous methods for automatic facial expression recognition assumed images were collected in controlled environments in which the subjects deliberately faced the camera. Since people often nod or turn their heads, automatic recognition of spontaneous facial behavior requires methods for handling out-of-image-plane head rotations. Here we explore an approach based on 3-D warping of images into canonical views. We evaluated the performance of the approach as a front end for a spontaneous expression recognition system using support vector machines and hidden Markov models. This system employed general-purpose learning mechanisms that can be applied to the recognition of any facial movement. The system was tested for recognition of a set of facial actions defined by the Facial Action Coding System (FACS). We showed that 3-D tracking and warping, followed by machine learning techniques applied directly to the warped images, is a viable and promising technology for automatic facial expression recognition. One exciting aspect of the approach presented here is that information about movement dynamics emerged from filters derived from the statistics of images.
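The back end described here, general-purpose learners applied directly to canonically warped images, is straightforward to sketch. Below, random arrays stand in for the 3-D-warped face images (the warping step itself is out of scope) and an SVM plays the recognizer role; the image size and number of action classes are illustrative assumptions.

```python
# SVM on canonically warped face images (stand-in data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
warped = rng.normal(size=(300, 32 * 32))  # flattened canonical-view images
actions = rng.integers(0, 4, size=300)    # hypothetical facial-action labels

X_tr, X_te, y_tr, y_te = train_test_split(warped, actions, random_state=0)
svm = SVC(kernel="rbf").fit(X_tr, y_tr)   # general-purpose learner
print("test accuracy:", svm.score(X_te, y_te))
```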
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)