Wu, Youxi
DialogueLLM: Context and Emotion Knowledge-Tuned Large Language Models for Emotion Recognition in Conversations
Zhang, Yazhou, Wang, Mengyao, Wu, Youxi, Tiwari, Prayag, Li, Qiuchi, Wang, Benyou, Qin, Jing
Large language models (LLMs) and their variants have shown extraordinary efficacy across numerous downstream natural language processing (NLP) tasks, presenting a new vision for the development of NLP. Despite their remarkable performance in natural language generation (NLG), LLMs lack a distinct focus on emotion understanding. As a result, using LLMs for emotion recognition may lead to suboptimal and inadequate precision. Another limitation of LLMs is that they are typically trained without leveraging multi-modal information. To overcome these limitations, we propose DialogueLLM, a context and emotion knowledge tuned LLM that is obtained by fine-tuning LLaMA models with 13,638 multi-modal (i.e., texts and videos) emotional dialogues. The visual information is used as supplementary knowledge to construct high-quality instructions. We offer a comprehensive evaluation of our proposed model on three benchmark emotion recognition in conversations (ERC) datasets and compare the results against the SOTA baselines and other SOTA LLMs. Additionally, DialogueLLM-7B can be easily trained using LoRA on a 40GB A100 GPU in 5 hours, facilitating reproducibility for other researchers.
Sequential three-way decisions with a single hidden layer feedforward neural network
Wu, Youxi, Cheng, Shuhui, Li, Yan, Lv, Rongjie, Min, Fan
Neural networks have been widely implemented in applications, including video frame inpainting [33] and automatic driving [31]. The performance of neural networks is mainly affected by hyperparameter selection and network topology. Hyperparameter selection [3, 4] is a classical topic in machine learning, which can be realized by grid search [26, 32] and particle swarm optimization [1, 24]. In addition, network topology [2, 30, 42] is key to neural network design and can be realized through three-way decisions [7] and an incremental learning mechanism [10, 15, 40]. To achieve an effective network structure, three-way decisions with a single hidden layer feedforward neural network (TWD-SFNN) [7] adopts a novel model to guide the number of hidden layer nodes. In addition, as a shallow neural network model, TWD-SFNN provides a new perspective on the topology design of multilayer neural networks, hence laying a theoretical foundation for the framework of deep learning. However, for practical applications, TWD-SFNN has two drawbacks: (i) in terms of performance, the generalization ability of TWD-SFNN needs to be further improved; and (ii) to analyze the relationship between the costs and the number of hidden layer nodes more thoroughly, the process costs of TWD-SFNN need to be considered. To improve the generalization ability of neural networks on structured datasets, and to further enrich the theoretical framework of deep learning, we employ sequential three-way decisions to guide the growth of the network topology.
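The core idea of using sequential three-way decisions to guide topology growth can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the thresholds `alpha`/`beta` and the callback `train_and_score` are hypothetical stand-ins for the trained SFNN and its decision thresholds, and the real method's cost-sensitive details are omitted.

```python
def three_way_split(scores, alpha=0.7, beta=0.3):
    """Partition samples into accept / reject / boundary regions by
    posterior-score thresholds (alpha, beta) -- one three-way decision step."""
    accept = [i for i, s in enumerate(scores) if s >= alpha]
    reject = [i for i, s in enumerate(scores) if s <= beta]
    boundary = [i for i, s in enumerate(scores) if beta < s < alpha]
    return accept, reject, boundary


def grow_hidden_nodes(train_and_score, max_nodes=16, alpha=0.7, beta=0.3):
    """Sequentially add hidden nodes while a boundary region remains.

    `train_and_score(k)` is a hypothetical callback: it trains an SFNN
    with k hidden nodes and returns one confidence score per sample.
    """
    k = 1
    while k < max_nodes:
        scores = train_and_score(k)
        _, _, boundary = three_way_split(scores, alpha, beta)
        if not boundary:   # all samples decided: stop growing the topology
            break
        k += 1             # uncertainty remains: add one more hidden node
    return k
```

For example, if each added node sharpens every sample's confidence, `grow_hidden_nodes` stops as soon as no sample falls between the two thresholds.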
OPP-Miner: Order-preserving sequential pattern mining
Wu, Youxi, Hu, Qian, Li, Yan, Guo, Lei, Zhu, Xingquan, Wu, Xindong
A time series is a collection of measurements in chronological order. Discovering patterns from time series is useful in many domains, such as stock analysis, disease detection, and weather forecasting. To discover patterns, existing methods often convert time series data into another form, such as a nominal/symbolic format, to reduce dimensionality, which inevitably distorts the data values. Moreover, existing methods largely neglect the order relationships between time series values. To tackle these issues, inspired by order-preserving matching, this paper proposes an Order-Preserving sequential Pattern (OPP) mining method, which represents patterns based on the order relationships of the time series data. An inherent advantage of such a representation is that the trend of a time series can be captured by the relative order of its underlying values. To obtain frequent trends in time series, we propose the OPP-Miner algorithm to mine patterns with the same trend (sub-sequences with the same relative order). OPP-Miner employs filtration and verification strategies to calculate the support and uses a pattern fusion strategy to generate candidate patterns. To compress the result set, we also study finding the maximal OPPs. Experiments validate that OPP-Miner is not only efficient and scalable but can also discover similar sub-sequences in time series. In addition, case studies show that our algorithms have high utility in analyzing the COVID-19 epidemic, identifying critical trends and improving clustering performance.
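The relative-order representation described above can be sketched in a few lines. This is a simplified illustration of the idea only, not the OPP-Miner algorithm itself: it uses a naive sliding-window support count rather than the paper's filtration/verification and pattern-fusion strategies, and ties are broken by position, which is a simplifying assumption.

```python
def rank_pattern(values):
    """Map a numeric sub-sequence to its relative-order (rank) pattern,
    e.g. [10, 30, 20] -> (1, 3, 2): lowest value gets rank 1."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return tuple(ranks)


def support(series, pattern):
    """Count sub-sequences of `series` whose relative order matches
    `pattern` -- a naive baseline for the support computed by OPP-Miner."""
    m = len(pattern)
    return sum(
        1
        for i in range(len(series) - m + 1)
        if rank_pattern(series[i:i + m]) == pattern
    )
```

Here the pattern `(1, 2, 3)` stands for a strictly rising trend of length three, regardless of the actual magnitudes involved.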
Distant Supervision for E-commerce Query Segmentation via Attention Network
Li, Zhao, Ding, Donghui, Zou, Pengcheng, Gong, Yu, Chen, Xi, Zhang, Ji, Gao, Jianliang, Wu, Youxi, Duan, Yucong
Booming online e-commerce platforms demand highly accurate approaches to segment queries that carry the product requirements of consumers. Recent works have shown that supervised methods, especially those based on deep learning, are attractive for achieving better performance on the problem of query segmentation. However, the lack of labeled data is still a big challenge for training a deep segmentation network, and the problem of Out-of-Vocabulary (OOV) words also adversely impacts the performance of query segmentation. Unlike query segmentation in an open domain, the e-commerce scenario can provide external documents that are closely related to these queries. Thus, to deal with the two challenges, we employ the idea of distant supervision and design a novel method to find contexts in external documents and extract features from these contexts. In this work, we propose a BiLSTM-CRF-based model with an attention module to encode external features, so that external context information can be utilized naturally and effectively to help query segmentation. Experiments on two datasets show the effectiveness of our approach compared with several kinds of baselines.