Goto

Collaborating Authors

 Fung, Gabriel Pui Cheong


TopicRefine: Joint Topic Prediction and Dialogue Response Generation for Multi-turn End-to-End Dialogue System

arXiv.org Artificial Intelligence

A multi-turn dialogue always follows a specific topic thread, and topic shift at the discourse level occurs naturally as the conversation progresses, necessitating the model's ability to capture different topics and generate topic-aware responses. Previous research has either predicted the topic first and then generated the relevant response, or simply applied the attention mechanism to all topics, ignoring the joint distribution of the topic prediction and response generation models and resulting in uncontrollable and unrelated responses. In this paper, we propose a joint framework with a topic refinement mechanism to learn these two tasks simultaneously. Specifically, we design a three-pass iteration mechanism to generate coarse response first, then predict corresponding topics, and finally generate refined response conditioned on predicted topics. Moreover, we utilize GPT2DoubleHeads and BERT for the topic prediction task respectively, aiming to investigate the effects of joint learning and the understanding ability of GPT model. Experimental results demonstrate that our proposed framework achieves new state-of-the-art performance at response generation task and the great potential understanding capability of GPT model.


MCML: A Novel Memory-based Contrastive Meta-Learning Method for Few Shot Slot Tagging

arXiv.org Artificial Intelligence

Meta-learning is widely used for few-shot slot tagging in the task of few-shot learning. The performance of existing methods is, however, seriously affected by catastrophic forgetting. This phenomenon is common in deep learning as the training and testing modules fail to take into account historical information, i.e. previously trained episodes in the metric-based meta-learning. To overcome this predicament, we propose the Memory-based Contrastive Meta-learning (MCML) method. Specifically, we propose a learn-from-memory mechanism that use explicit memory to keep track of the label representations of previously trained episodes and propose a contrastive learning method to compare the current label embedded in the few shot episode with the historic ones stored in the memory, and an adaption-from memory mechanism to determine the output label based on the contrast between the input labels embedded in the test episode and the label clusters in the memory. Experimental results show that MCML is scalable and outperforms metric-based meta-learning and optimization-based meta-learning on all 1shot, 5-shot, 10-shot, and 20-shot scenarios of the SNIPS dataset.


KddRES: A Multi-level Knowledge-driven Dialogue Dataset for Restaurant Towards Customized Dialogue System

arXiv.org Artificial Intelligence

Compared with CrossWOZ (Chinese) and MultiWOZ (English) dataset which have coarse-grained information, there is no dataset which handle fine-grained and hierarchical level information properly. In this paper, we publish a first Cantonese knowledge-driven Dialogue Dataset for REStaurant (KddRES) in Hong Kong, which grounds the information in multi-turn conversations to one specific restaurant. Our corpus contains 0.8k conversations which derive from 10 restaurants with various styles in different regions. In addition to that, we designed fine-grained slots and intents to better capture semantic information. The benchmark experiments and data statistic analysis show the diversity and rich annotations of our dataset. We believe the publish of KddRES can be a necessary supplement of current dialogue datasets and more suitable and valuable for small and middle enterprises (SMEs) of society, such as build a customized dialogue system for each restaurant. The corpus and benchmark models are publicly available.


CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and Prediction with Multi-task Learning

arXiv.org Artificial Intelligence

This paper describes our system submitted to task 4 of SemEval 2020: Commonsense Validation and Explanation (ComVE) which consists of three sub-tasks. The challenge is to directly validate whether the system can recognize natural language statements that make sense from those that do not, and also require to generate reasonable explanation. Based on BERT architecture with multi-task setting, we propose an effective and interpretable "Explain, Reason and Predict" (ERP) system to solve the three sub-tasks about commonsense: (a) Validation, and (c) Explanation, (b) Reasoning, following the order of the competition. Inspired by cognitive studies of common sense, our system first generate a reason or understanding of the sentences and then choose which one statement makes sense, which is achieved by multi-task learning. The rational experiment validates our assumption and boost the performance. During the post-evaluation, our system has reached 92.9% accuracy in subtask A (rank 11), 89.7% accuracy in subtask B (rank 8), and BLEU score of 12.9 in subtask C (rank 9)


A Semi-Supervised Network Embedding Model for Protein Complexes Detection

AAAI Conferences

Protein complex is a group of associated polypeptide chains which plays essential roles in biological process. Given a graph representing protein-protein interactions (PPI) network, it is critical but non-trivial to detect protein complexes.In this paper, we propose a semi-supervised network embedding model by adopting graph convolutional networks to effectively detect densely connected subgraphs. We conduct extensive experiment on two popular PPI networks with various data sizes and densities. The experimental results show our approach achieves state-of-the-art performance.


A New Benchmark and Evaluation Schema for Chinese Typo Detection and Correction

AAAI Conferences

Despite the vast amount of research related to Chinese typo detection, we still lack a publicly available benchmark dataset for evaluation. Furthermore, no precise evaluation schema for Chinese typo detection has been defined. In response to these problems: (1) we release a benchmark dataset to assist research on Chinese typo correction; (2) we present an evaluation schema which was adopted in our NLPTEA 2017 Shared Task on Chinese Spelling Check; and (3) we report new improvements to our Chinese typo detection system ACT.