

Bimodal Speech Emotion Recognition Using Pre-Trained Language Models

arXiv.org Machine Learning

ABSTRACT Speech emotion recognition is a challenging task and an important step towards more natural human-machine interaction. We show that pre-trained language models can be fine-tuned for text emotion recognition, achieving an accuracy of 69.5% on Task 4A of SemEval 2017, improving upon the previous state of the art by over 3% absolute. We combine these language models with speech emotion recognition, achieving 73.5% accuracy when using provided transcriptions and speech data on a subset of four classes of the IEMOCAP dataset. For our experiments, we created IEmoNet, a modular and adaptable bimodal framework for speech emotion recognition based on pre-trained language models. Lastly, we discuss the idea of using an emotional classifier as a reward for reinforcement learning as a step towards more successful and convenient human-machine interaction. Index Terms: Speech Emotion Recognition, Text Emotion Recognition, Bimodal Emotion Recognition, IEMOCAP, Self Attention, Pre-trained Language Models 1. INTRODUCTION Emotions are an important aspect of human behavior. They not only influence how we react to our environment [1, 2], but also actively change our perception of it [3] and sometimes even contribute to how well we remember specific events [4]. As such, they influence both human-human and human-machine interaction. However, in human-machine interaction, emotions are often considered only marginally, if at all.
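To make the idea of combining a text model with a speech model concrete, here is a minimal sketch of late fusion by weighted averaging of class probabilities. This is an illustrative assumption and not necessarily how IEmoNet combines the two modalities; the function names, the `text_weight` parameter, the label names, and the probability values are all hypothetical.

```python
def fuse_probabilities(text_probs, speech_probs, text_weight=0.5):
    """Weighted average of two per-class probability dicts (late fusion).

    Assumption: both dicts share the same labels and each sums to 1.
    """
    fused = {
        label: text_weight * text_probs[label]
        + (1.0 - text_weight) * speech_probs[label]
        for label in text_probs
    }
    # Renormalize to guard against rounding drift.
    total = sum(fused.values())
    return {label: p / total for label, p in fused.items()}


def predict_emotion(text_probs, speech_probs, text_weight=0.5):
    """Return the label with the highest fused probability."""
    fused = fuse_probabilities(text_probs, speech_probs, text_weight)
    return max(fused, key=fused.get)


# Hypothetical classifier outputs for one utterance (labels are assumed).
text_out = {"angry": 0.1, "happy": 0.6, "neutral": 0.2, "sad": 0.1}
speech_out = {"angry": 0.5, "happy": 0.2, "neutral": 0.2, "sad": 0.1}

equal_weight_label = predict_emotion(text_out, speech_out, text_weight=0.5)
speech_heavy_label = predict_emotion(text_out, speech_out, text_weight=0.2)
```

With equal weights the text model's preference wins in this toy input, while weighting speech more heavily flips the decision, which is the kind of trade-off a bimodal system has to tune.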


A Reparameterization-Invariant Flatness Measure for Deep Neural Networks

arXiv.org Machine Learning

The performance of deep neural networks is often attributed to their automated, task-related feature construction. It remains an open question, though, why this leads to solutions with good generalization, even in cases where the number of parameters is larger than the number of samples. Back in the 90s, Hochreiter and Schmidhuber observed that flatness of the loss surface around a local minimum correlates with low generalization error. For several flatness measures, this correlation has been empirically validated. However, it has recently been shown that existing measures of flatness cannot theoretically be related to generalization due to a lack of invariance with respect to reparameterizations. We propose a natural modification of existing flatness measures that results in invariance to reparameterization.
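The invariance problem can be illustrated with a toy model. The sketch below assumes a one-dimensional two-layer linear predictor with squared loss, L(w1, w2) = (w1 * w2 * x - y)^2, and shows that rescaling (w1, w2) to (a * w1, w2 / a) leaves the function unchanged but changes the curvature with respect to w1. Multiplying the curvature by w1^2 removes this dependence in the toy case. This weight-scaled quantity is only an illustration of the idea and is not claimed to be the measure proposed in the paper.

```python
import math


def loss(w1, w2, x, y):
    """Squared loss of the toy predictor f(x) = w1 * w2 * x."""
    return (w1 * w2 * x - y) ** 2


def curvature_w1(w1, w2, x):
    """Analytic second derivative of the loss with respect to w1."""
    return 2.0 * (w2 * x) ** 2


def scaled_curvature_w1(w1, w2, x):
    """Curvature multiplied by w1**2 (illustrative scale-compensated quantity)."""
    return w1 ** 2 * curvature_w1(w1, w2, x)


x, y = 1.5, 2.0
w1, w2 = 1.0, 1.0
a = 10.0  # reparameterization factor

# Same function, different parameterization.
v1, v2 = a * w1, w2 / a

same_loss = math.isclose(loss(w1, w2, x, y), loss(v1, v2, x, y))
raw_changes = not math.isclose(curvature_w1(w1, w2, x), curvature_w1(v1, v2, x))
scaled_invariant = math.isclose(
    scaled_curvature_w1(w1, w2, x), scaled_curvature_w1(v1, v2, x)
)
```

Here the raw curvature shrinks by a factor of a^2 under the rescaling even though the loss value is identical, which is why an unscaled Hessian-based measure cannot by itself be tied to generalization.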


Learning Likelihoods with Conditional Normalizing Flows

arXiv.org Machine Learning

Such behavior is desirable in multivariate structured prediction tasks, where handcrafted per-pixel loss-based methods inadequately capture strong correlations between output dimensions. CNFs are efficient in sampling and inference, can be trained with a likelihood-based objective, and, being generative flows, do not suffer from mode collapse or training instabilities. We provide an effective method to train continuous CNFs for binary problems and, in particular, apply these CNFs to super-resolution and vessel segmentation tasks, demonstrating competitive performance on standard benchmark datasets in terms of likelihood and conventional metrics. When the output y is high-dimensional this is a particularly challenging task, and the practitioner is left with many design choices. Do we factorize the conditional? If not, do we model correlations with, say, a conditional random field (Prince, 2012)? Do we use a unimodal distribution? How fat should the tails be? Do we use an explicit likelihood at all, or use implicit methods (Mohamed & Rezende, 2015) such as a GAN (Goodfellow et al., 2014)? Do we quantize the output?
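As a rough illustration of how a conditional flow yields an explicit likelihood, the sketch below implements a single one-dimensional affine flow whose shift and log-scale depend on the conditioning input x. The `conditioner` function and its coefficients are hypothetical stand-ins for a learned network; the architecture described in the paper is much richer than this.

```python
import math


def conditioner(x):
    """Hypothetical conditioning function returning (shift, log_scale) for input x."""
    return 2.0 * x, 0.5 * x


def forward(y, x):
    """Map an output y to base-space z and return the log-determinant term."""
    shift, log_scale = conditioner(x)
    z = (y - shift) * math.exp(-log_scale)
    log_det = -log_scale  # log |dz/dy|
    return z, log_det


def inverse(z, x):
    """Map a base sample z back to output space (used for sampling)."""
    shift, log_scale = conditioner(x)
    return z * math.exp(log_scale) + shift


def log_likelihood(y, x):
    """log p(y | x) by the change-of-variables formula with a standard normal base."""
    z, log_det = forward(y, x)
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    return log_base + log_det


def gaussian_log_pdf(y, mean, std):
    """Reference Gaussian log-density for checking the flow."""
    return (
        -0.5 * ((y - mean) / std) ** 2
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


x_cond, y_obs = 1.0, 2.5
flow_ll = log_likelihood(y_obs, x_cond)
reference_ll = gaussian_log_pdf(y_obs, 2.0 * x_cond, math.exp(0.5 * x_cond))
round_trip = inverse(forward(y_obs, x_cond)[0], x_cond)
```

A single affine layer like this is just a conditional Gaussian; stacking several invertible layers, each conditioned on x, is what lets a flow represent richer conditional densities while keeping the likelihood exact.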


Heuristic Strategies in Uncertain Approval Voting Environments

arXiv.org Artificial Intelligence

In many collective decision-making situations, agents vote to choose an alternative that best represents the preferences of the group. Agents may manipulate the vote to achieve a better outcome by voting in a way that does not reflect their true preferences. In real-world voting scenarios, people often do not have complete information about other voters' preferences, and it can be computationally complex to identify a strategy that will maximize their expected utility. In such situations, it is often assumed that voters will vote truthfully rather than expending the effort to strategize. However, being truthful is just one possible heuristic that may be used. In this paper, we examine the effectiveness of heuristics in single-winner and multi-winner approval voting scenarios with missing votes. In particular, we look at heuristics where a voter ignores information about other voting profiles and makes their decisions based solely on how much they like each candidate. In a behavioral experiment, we show that people vote truthfully in some situations and prioritize high-utility candidates in others. We examine when these behaviors maximize expected utility and show how the structure of the voting environment affects both how well each heuristic performs and how humans employ these heuristics.
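For readers unfamiliar with approval voting, the sketch below tallies approval ballots for single-winner and multi-winner elections and implements one simple utility-only heuristic: approve every candidate whose utility is at least the voter's mean utility. The heuristic, the alphabetical tie-breaking rule, and all names and numbers are illustrative assumptions and are not taken from the paper's experiments.

```python
from collections import Counter


def above_mean_ballot(utilities):
    """Approve each candidate whose utility is at least the voter's mean utility."""
    mean = sum(utilities.values()) / len(utilities)
    return {cand for cand, u in utilities.items() if u >= mean}


def approval_winners(ballots, k=1):
    """Return the k candidates with the most approvals.

    Assumption: ties are broken alphabetically. Candidates with no
    approvals on any ballot are never returned.
    """
    counts = Counter()
    for ballot in ballots:
        counts.update(ballot)
    ranked = sorted(counts, key=lambda cand: (-counts[cand], cand))
    return ranked[:k]


# One voter's hypothetical utilities, turned into a ballot by the heuristic.
voter_utilities = {"a": 9, "b": 5, "c": 1}
heuristic_ballot = above_mean_ballot(voter_utilities)

# A small hypothetical profile of approval ballots.
profile = [{"a", "b"}, {"b"}, {"a", "c"}, {"b", "c"}]
single_winner = approval_winners(profile, k=1)
two_winners = approval_winners(profile, k=2)
```

Because the heuristic looks only at the voter's own utilities, it can be applied even when the other ballots are unknown, which is the setting with missing votes that the abstract describes.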



Augmented reality wine labels see growing adoption

#artificialintelligence

The company's app and platform let wineries and wine producers create, manage, and market an augmented reality experience for their own labels via a smartphone app. Now, says the company, 524 wineries from across the world are participating in the trial and accessing the marketing potential of a marriage of artificial intelligence and augmented reality. "I don't believe anything like this has been done to this level before – bringing together artificial intelligence, augmented reality, clever technology, 500 different wineries with different visuals and branding, as well as different languages," says app creator and Winerytale founder Dave Chaffey. "This platform is purpose-built for mass adoption and accessibility to any winery wanting to take advantage of a brand marketing and sales future that will undeniably involve augmented reality." The technology is designed to work on any wine label, using artificial intelligence to scan and recognize labels, and augmented reality to showcase the wine's backstory by beaming it from an imaginary space inside and outside of the bottle.


Dispelling the myth of the destructive artificial intelligence - The Market Herald

#artificialintelligence

One of the anxieties plaguing the workforce as we enter the next decade is the fear of automation replacing blue-collar jobs. At the forefront of these anxieties is artificial intelligence (AI). The issue is so pressing that American Democratic candidate Andrew Yang's argument for a universal basic income policy is riding on its ability to speak to middle-American manual labour workers. However, for a lot of consumers around the world, the thought of automation can be exciting, such as letting a Tesla drive you hands-free down the freeway on a long trip. But this doesn't create an exception for those who are worried their jobs will be replaced by an automated crane or self-driven truck.


SCL: SCL Irish Group event: "Exploring Bias in AI" (Breakfast meeting) - Thursday 28 November 2019, Dublin

#artificialintelligence

This event will be highly interactive. Dr Suzanne Little will give a short talk followed by group break-out sessions to discuss the topic in more depth. About the speaker: Dr Suzanne Little is Associate Professor and Senior Lecturer at the School of Computing, Dublin City University and SFI Principal Investigator, Insight Centre for Data Analytics. Before moving to the School of Computing at DCU in 2015, Suzanne was previously a Senior Research Fellow at the Insight Centre for Data Analytics at DCU. Suzanne originally joined the CLARITY research centre at Dublin City University in February 2012 and was principally responsible for the SAVASA project (Standards based Approach to Video Archive Search and Analysis). In 2013, CLARITY evolved to become Insight where Suzanne worked on and managed a number of projects in video analytics, motion analysis and data collection.


U-CNNpred: A Universal CNN-based Predictor for Stock Markets

arXiv.org Machine Learning

The performance of financial market prediction systems depends heavily on the quality of the features they use. While researchers have used various techniques for enhancing stock-specific features, less attention has been paid to extracting features that represent the general mechanisms of financial markets. In this paper, we investigate the importance of extracting such general features in the stock market prediction domain and show how it can improve the performance of financial market prediction. We present a framework called U-CNNpred that uses a CNN-based structure. A base model is trained in a specially designed layer-wise training procedure over a pool of historical data from many financial markets, in order to extract the common patterns from different markets. Our experiments, in which we have used hundreds of stocks in the S&P 500 as well as 14 famous indices around the world, show that this model can outperform baseline algorithms when predicting the directional movement of the markets for which it has been trained. We also show that the base model can be fine-tuned for predicting new markets and achieve better performance than state-of-the-art baseline algorithms that focus on constructing market-specific models from scratch.


Patch Reordering: a Novel Way to Achieve Rotation and Translation Invariance in Convolutional Neural Networks

arXiv.org Machine Learning

Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance on many visual recognition tasks. However, the combination of convolution and pooling operations only shows invariance to small local changes in the location of meaningful objects in the input. Sometimes, such networks are trained using data augmentation to encode this invariance into the parameters, which restricts the capacity of the model to learn the content of these objects. A more efficient use of the parameter budget is to encode rotation or translation invariance into the model architecture, which relieves the model of the need to learn it. To enable the model to focus on learning the content of objects rather than their locations, we propose to conduct patch ranking of the feature maps before feeding them into the next layer. When patch ranking is combined with convolution and pooling operations, we obtain consistent representations regardless of the location of meaningful objects in the input. We show that the patch ranking module improves the performance of the CNN on many benchmark tasks, including MNIST digit recognition, large-scale image recognition, and image retrieval. The code is available at https://github.com//jasonustc/caffe-multigpu/tree/TICNN .
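The patch ranking idea can be sketched on a single 2D feature map: split the map into non-overlapping patches, rank them by a score, and write them back in rank order so that the output no longer depends on where the high-scoring patch originally sat. The score used here (sum of squared activations) and the function names are assumptions for illustration; the released Caffe code may rank patches differently.

```python
def split_patches(fmap, p):
    """Split a 2D list into non-overlapping p x p patches in row-major order.

    Assumption: height and width are both divisible by p.
    """
    h, w = len(fmap), len(fmap[0])
    patches = []
    for r in range(0, h, p):
        for c in range(0, w, p):
            patches.append([row[c:c + p] for row in fmap[r:r + p]])
    return patches


def energy(patch):
    """Illustrative ranking score: sum of squared activations in the patch."""
    return sum(v * v for row in patch for v in row)


def reorder_patches(fmap, p):
    """Place patches back in descending-energy order, row-major."""
    h, w = len(fmap), len(fmap[0])
    ranked = sorted(split_patches(fmap, p), key=energy, reverse=True)
    per_row = w // p
    out = [[0] * w for _ in range(h)]
    for i, patch in enumerate(ranked):
        r0, c0 = (i // per_row) * p, (i % per_row) * p
        for dr, row in enumerate(patch):
            out[r0 + dr][c0:c0 + p] = row
    return out


# The same 2x2 "object" placed in two different corners of a 4x4 map.
map_top_left = [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
map_bottom_right = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 2], [0, 0, 3, 4]]

out_a = reorder_patches(map_top_left, 2)
out_b = reorder_patches(map_bottom_right, 2)
```

In this toy form the output is identical only for shifts that are multiples of the patch size; handling finer shifts and rotations depends on how ranking interacts with the surrounding convolution and pooling layers.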