Tsinghua University
Inferring Emotion from Conversational Voice Data: A Semi-Supervised Multi-Path Generative Neural Network Approach
Zhou, Suping (Tsinghua University) | Jia, Jia (Tsinghua University) | Wang, Qi (Tsinghua University) | Dong, Yufei (University of Science & Technology, Beijing) | Yin, Yufeng | Lei, Kehua (Tsinghua University)
To give more humanized responses in Voice Dialogue Applications (VDAs), inferring emotional states from users’ queries may play an important role. However, VDAs have a tremendous number of users and a massive amount of unlabeled data with high-dimensional features from multimodal information, which challenges traditional speech emotion recognition methods. In this paper, to better infer emotion from conversational voice data, we propose a semi-supervised multi-path generative neural network. Specifically, we first build a novel supervised multi-path deep neural network framework. To avoid high-dimensional input, raw features are trained in groups by local classifiers. The high-level features of each local classifier are then concatenated as the input of a global classifier. These two kinds of classifiers are trained simultaneously through a single objective function to achieve more effective and discriminative emotion inference. To further address the scarcity of labeled data, we extend the multi-path deep neural network to a generative model based on a semi-supervised variational autoencoder (semi-VAE), which can be trained on labeled and unlabeled data simultaneously. Experiments on a real-world dataset of size 24,000 collected from Sogou Voice Assistant (SVAD13) and on the benchmark dataset IEMOCAP show that our method significantly outperforms the existing state-of-the-art results.
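The supervised multi-path idea described above (local classifiers per feature group, concatenated high-level features feeding a global classifier, one joint objective) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the random initialization, the helper names, and the choice of an unweighted sum of cross-entropy losses as the single objective are all assumptions made for illustration, and the semi-VAE extension is not shown.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer; small random weights, zero biases."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer):
    """Affine map: one output per weight row."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

def multi_path_loss(groups, label, paths, global_head):
    """Joint objective (assumed form): sum of local losses plus the global loss."""
    concatenated, loss = [], 0.0
    for x, (hidden_layer, local_head) in zip(groups, paths):
        h = relu(dense(x, hidden_layer))                 # high-level features of one path
        loss += cross_entropy(softmax(dense(h, local_head)), label)  # local classifier
        concatenated.extend(h)                           # concatenate across paths
    global_probs = softmax(dense(concatenated, global_head))         # global classifier
    loss += cross_entropy(global_probs, label)
    return loss, global_probs

# Hypothetical setup: two feature groups, three emotion classes.
rng = random.Random(0)
group_dims, hidden_dim, n_classes = [4, 6], 3, 3
paths = [(init_layer(d, hidden_dim, rng), init_layer(hidden_dim, n_classes, rng))
         for d in group_dims]
global_head = init_layer(hidden_dim * len(group_dims), n_classes, rng)
groups = [[rng.random() for _ in range(d)] for d in group_dims]

loss, probs = multi_path_loss(groups, 1, paths, global_head)
```

Because every local loss and the global loss enter one scalar, a single gradient step on `loss` would update all paths and the global head together, which is the sense in which the two kinds of classifiers are trained simultaneously in this sketch.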
Combining Machine Learning and Crowdsourcing for Better Understanding Commodity Reviews
Wu, Heting (Beihang University) | Sun, Hailong (Beihang University) | Fang, Yili (Beihang University) | Hu, Kefan (Beihang University) | Xie, Yongqing (Tsinghua University) | Song, Yangqiu (University of Illinois) | Liu, Xudong (Beihang University)
In e-commerce systems, customer reviews are important information for understanding market feedback on particular commodities. However, accurately analyzing reviews is challenging because of the complexity of natural language processing and the informal descriptions found in reviews. Existing methods mainly focus on efficient algorithms, which cannot guarantee the accuracy of review analysis. Crowdsourcing can improve the accuracy of review analysis, but it incurs extra costs and slow response times. In this work, we combine machine learning and crowdsourcing to better understand customer reviews. First, we use multiple machine learning algorithms collectively to pre-process review classification. Second, we select the reviews on which the machine learning algorithms do not all agree and assign them to humans to process. Third, the results from machine learning and crowdsourcing are aggregated to form the final analysis results. Finally, we perform experiments on practical review data to confirm the effectiveness of our method.
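The three-step pipeline above (classify with several algorithms, route disagreements to humans, aggregate) can be sketched as follows. This is a toy illustration rather than the authors' system: the keyword-based classifiers, the function names, the sample reviews, and the use of majority voting to aggregate crowd answers are all assumptions, since the abstract does not specify them.

```python
from collections import Counter

def route_reviews(reviews, classifiers):
    """Split reviews into machine-labeled (all classifiers agree) and disputed ones."""
    auto_labels, disputed = {}, []
    for review_id, text in reviews.items():
        votes = [classify(text) for classify in classifiers]
        if len(set(votes)) == 1:
            auto_labels[review_id] = votes[0]   # unanimous: accept the machine label
        else:
            disputed.append(review_id)          # disagreement: send to crowd workers
    return auto_labels, disputed

def aggregate(auto_labels, crowd_answers):
    """Merge machine labels with crowd labels (majority vote is an assumption)."""
    final = dict(auto_labels)
    for review_id, answers in crowd_answers.items():
        final[review_id] = Counter(answers).most_common(1)[0][0]
    return final

# Hypothetical stand-ins for trained classifiers: simple keyword rules.
classifiers = [
    lambda text: "pos" if "great" in text else "neg",
    lambda text: "neg" if "bad" in text else "pos",
    lambda text: "pos" if "love" in text else "neg",
]

reviews = {
    1: "great phone, love it",
    2: "bad battery but great screen",
    3: "terrible, bad quality",
}

auto_labels, disputed = route_reviews(reviews, classifiers)

# Hypothetical answers collected from three crowd workers for the disputed review.
crowd_answers = {2: ["neg", "pos", "neg"]}
final = aggregate(auto_labels, crowd_answers)
```

Only review 2 reaches the crowd here, which reflects the cost argument in the abstract: human effort is spent only where the algorithms disagree.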