dialogue modeling
Advances in Multi-turn Dialogue Comprehension: A Survey
Training machines to understand natural language and interact with humans is an elusive and essential task in the field of artificial intelligence. In recent years, a diversity of dialogue systems has been designed with the rapid development of deep learning research, especially the recent pre-trained language models (PrLMs). Among these studies, the fundamental yet challenging part is dialogue comprehension, whose role is to teach machines to read and comprehend the dialogue context before responding. In this paper, we review previous methods from the perspective of dialogue modeling. We summarize the characteristics and challenges of dialogue comprehension in contrast to plain-text reading comprehension. Then, we discuss three typical patterns of dialogue modeling that are widely used in dialogue comprehension tasks such as response selection and conversation question-answering, as well as dialogue-related language modeling techniques to enhance PrLMs in dialogue scenarios. Finally, we highlight the technical advances in recent years and point out the lessons we can learn from the empirical analysis and the prospects towards a new frontier of research.
Toward Dialogue Modeling: A Semantic Annotation Scheme for Questions and Answers
Cruz-Blandón, Maria-Andrea, Minnema, Gosse, Nourbakhsh, Aria, Boritchev, Maria, Amblard, Maxime
The present study proposes an annotation scheme for classifying the content and discourse contribution of question-answer pairs. We propose detailed guidelines for using the scheme and apply them to dialogues in English, Spanish, and Dutch. Finally, we report on initial machine learning experiments for automatic annotation.
Multi-turn Dialogue Response Generation with Autoregressive Transformer Models
Olabiyi, Oluwatobi, Mueller, Erik T.
Neural dialogue models, despite their successes, still suffer from a lack of relevance, diversity, and in many cases coherence in their generated responses. These issues have been attributed to reasons including (1) short-range model architectures that capture limited temporal dependencies, (2) limitations of the maximum likelihood training objective, (3) the concave entropy profile of dialogue datasets, resulting in short and generic responses, and (4) the out-of-vocabulary problem, leading to the generation of a large number of <UNK> tokens. Autoregressive transformer models such as GPT-2, although trained with the maximum likelihood objective, do not suffer from the out-of-vocabulary problem and have demonstrated an excellent ability to capture long-range structures in language modeling tasks. In this paper, we examine the use of autoregressive transformer models for multi-turn dialogue response generation. In our experiments, we employ small and medium GPT-2 models (with publicly available pretrained language model parameters) on the open-domain Movie Triples dataset and the closed-domain Ubuntu Dialogue dataset. The models (with and without pretraining) achieve significant improvements over the baselines for multi-turn dialogue response generation. They also produce state-of-the-art performance on the two datasets based on several metrics, including BLEU, ROUGE, and distinct n-gram.
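The abstract names the distinct n-gram metric without defining it. As a minimal sketch, assuming the commonly used definition (unique n-grams divided by total n-grams over all generated responses) and plain whitespace tokenization, it could be computed as follows. The function name `distinct_n` is illustrative and the paper's exact evaluation script may differ.

```python
from typing import List, Set, Tuple


def distinct_n(responses: List[str], n: int) -> float:
    """Ratio of unique n-grams to total n-grams across all responses.

    Assumption: tokens are obtained by whitespace splitting; a real
    evaluation would use the same tokenizer as the model under test.
    """
    unique: Set[Tuple[str, ...]] = set()
    total = 0
    for response in responses:
        tokens = response.split()
        # Slide a window of width n over the token sequence.
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    # Guard against empty input or responses shorter than n tokens.
    return len(unique) / total if total else 0.0
```

A low score indicates short, generic, repetitive output: two identical copies of "i do not know" share every unigram, so only half of the counted unigrams are unique.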
Dialogue Modeling Via Hash Functions: Applications to Psychotherapy
Garg, Sahil, Cecchi, Guillermo, Rish, Irina, Gao, Shuyang, Bhaskar, Bhavana, Steeg, Greg Ver, Goyal, Palash, Galstyan, Aram
We propose a novel machine-learning framework for dialogue modeling which uses representations based on hash functions. More specifically, each person's response is represented by a binary hashcode where each bit reflects the presence or absence of a certain text pattern in the response. Hashcodes serve as compressed text representations, allowing for efficient similarity search. Moreover, the hashcode of one person's response can be used as a feature vector for predicting the hashcode representing another person's response. The proposed hashing model of dialogue is obtained by maximizing a novel lower bound on the mutual information between the hashcodes of consecutive responses. We apply our approach in the psychotherapy domain, evaluating its effectiveness on a real-life dataset consisting of therapy sessions with patients suffering from depression.
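The idea of a binary hashcode used for similarity search can be illustrated with a small sketch. This is not the authors' method: the paper learns its hash functions by maximizing a mutual-information bound, whereas the fixed substring patterns in `PATTERNS` below are a hypothetical stand-in chosen only to show how bits encode pattern presence and how Hamming distance compares responses.

```python
from typing import List, Sequence

# Hypothetical, hand-picked patterns; the paper learns its hash functions.
PATTERNS = ("feel", "sleep", "why", "?")


def hashcode(response: str, patterns: Sequence[str] = PATTERNS) -> List[int]:
    """One bit per pattern: 1 if the pattern occurs in the response, else 0."""
    text = response.lower()
    return [1 if pattern in text else 0 for pattern in patterns]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two equal-length hashcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest(query: Sequence[int], candidates: Sequence[Sequence[int]]) -> int:
    """Index of the candidate hashcode closest to the query in Hamming distance."""
    return min(range(len(candidates)), key=lambda i: hamming(query, candidates[i]))
```

For example, `hashcode("I feel like I cannot sleep")` sets the bits for "feel" and "sleep" only, and `nearest` can then retrieve the stored response whose code differs in the fewest bits.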
Dialogue Generation With GAN
Su, Hui (The Hong Kong Polytechnic University) | Shen, Xiaoyu (Max Planck Institute Informatics) | Hu, Pengwei (The Hong Kong Polytechnic University) | Li, Wenjie (The Hong Kong Polytechnic University) | Chen, Yun (The University of Hong Kong)
This paper presents a Generative Adversarial Network (GAN) to model multi-turn dialogue generation, which trains a latent hierarchical recurrent encoder-decoder simultaneously with a discriminative classifier that makes the prior approximate the posterior. Experiments show that our model achieves better results.
Coherent Dialogue with Attention-Based Language Models
Mei, Hongyuan (Johns Hopkins University) | Bansal, Mohit (The University of North Carolina at Chapel Hill) | Walter, Matthew R. (Toyota Technological Institute at Chicago)
We model coherent conversation continuation via RNN-based dialogue models equipped with a dynamic attention mechanism. Our attention-RNN language model dynamically increases the scope of attention on the history as the conversation continues, as opposed to standard attention (or alignment) models with a fixed input scope in a sequence-to-sequence model. This allows each generated word to be associated with the most relevant words in its corresponding conversation history. We evaluate the model on two popular dialogue datasets, the open-domain MovieTriples dataset and the closed-domain Ubuntu Troubleshoot dataset, and achieve significant improvements over the state-of-the-art and baselines on several metrics, including complementary diversity-based metrics, human evaluation, and qualitative visualizations. We also show that a vanilla RNN with dynamic attention outperforms more complex memory models (e.g., LSTM and GRU) by allowing for flexible, long-distance memory. We promote further coherence via topic modeling-based reranking.
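The growing attention scope described above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's architecture: it uses plain dot-product scoring, treats the hidden states as given vectors, and the names `attend` and `contexts_over_history` are hypothetical. The point it shows is only that the set of attended positions expands by one at every step instead of staying fixed.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, history: List[Vector]) -> Vector:
    """Dot-product attention: weighted average of history vectors."""
    scores = [sum(q * h for q, h in zip(query, past)) for past in history]
    weights = softmax(scores)
    return [
        sum(w * past[d] for w, past in zip(weights, history))
        for d in range(len(query))
    ]


def contexts_over_history(states: List[Vector]) -> List[Vector]:
    """Context vector per step, attending over all earlier states.

    The history grows by one state per step, so later steps attend
    over a wider scope than earlier ones.
    """
    history: List[Vector] = []
    contexts: List[Vector] = []
    for state in states:
        if history:
            contexts.append(attend(state, history))
        else:
            # No history yet at the first step: use a zero context.
            contexts.append([0.0] * len(state))
        history.append(state)
    return contexts
```

With three input states, the second step attends over one earlier state and the third step over two, which is the contrast with a fixed-scope alignment model.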
Jobs at x.ai
Our start-up began 2 years ago and we have successfully gone through 23M round-B funding. We have an awesome dataset, an awesome team of data scientists and an equally awesome NLP challenge. We have the ability to quickly produce labeled datasets and test novel NLP techniques, including semantic parsing, deep learning (Convolutional, Recurrent, Recursive neural nets), and various forms of dialogue modeling (e.g. We are looking for PhD-level candidates (or equivalent) with a strong background in either semantic parsers, entity extraction from text, or human-agent dialogue modeling. The candidate is expected to be able to design, implement and lead the evolution of one of our critical NLP tasks together with a team.