The Open-domain Paradox for Chatbots: Common Ground as the Basis for Human-like Dialogue
Skantze, Gabriel, Doğruöz, A. Seza
There is a surge of interest in the development of open-domain chatbots, driven by recent advancements in large language models. The "openness" of the dialogue is expected to be maximized by providing minimal information to the users about the common ground they can expect, including the presumed joint activity. However, evidence suggests that the effect is the opposite: asking users to "just chat about anything" results in a very narrow form of dialogue, which we refer to as the "open-domain paradox". In this position paper, we explain this paradox through the theory of common ground as the basis for human-like communication. Furthermore, we question the assumptions behind open-domain chatbots and identify paths forward for enabling common ground in human-computer dialogue.
Understanding Multi-Turn Toxic Behaviors in Open-Domain Chatbots
Chen, Bocheng, Wang, Guangjing, Guo, Hanqing, Wang, Yuanda, Yan, Qiben
Recent advances in natural language processing and machine learning have led to the development of chatbot models, such as ChatGPT, that can engage in conversational dialogue with human users. However, the ability of these models to generate toxic or harmful responses during a non-toxic multi-turn conversation remains an open research question. Existing research focuses on single-turn sentence testing, while we find that 82% of the individual non-toxic sentences that elicit toxic behaviors in a conversation are considered safe by existing tools. In this paper, we design a new attack, ToxicBot, by fine-tuning a chatbot to engage in conversation with a target open-domain chatbot. The chatbot is fine-tuned on a collection of crafted conversation sequences, each of which begins with a sentence from a dataset of crafted prompts. Our extensive evaluation shows that open-domain chatbot models can be triggered to generate toxic responses in a multi-turn conversation. In the best scenario, ToxicBot achieves a 67% activation rate. The conversation sequences in the fine-tuning stage help trigger toxicity in a conversation, which allows the attack to bypass two defense methods. Our findings suggest that further research is needed to address chatbot toxicity in dynamic, interactive environments. ToxicBot can be used by both industry and researchers to develop methods for detecting and mitigating toxic responses in conversational dialogue and to improve the robustness of chatbots for end users.
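The multi-turn attack described in the abstract above can be pictured as a probing loop: an attacker model keeps the conversation going while a toxicity scorer checks each target reply. Below is a minimal, self-contained sketch of that loop. The two toy bots, the keyword-based `toxicity_score`, and the threshold are stand-ins invented for illustration; the paper's actual ToxicBot uses a fine-tuned chatbot as the attacker and real toxicity classifiers.

```python
def toxicity_score(text):
    # Stand-in for a real toxicity classifier (e.g. Perspective API);
    # here a toy keyword heuristic for illustration only.
    toxic_words = {"idiot", "stupid", "hate"}
    words = set(text.lower().split())
    return len(words & toxic_words) / max(len(words), 1)

def probe_dialogue(attacker, target, opening, turns=5, threshold=0.1):
    """Alternate attacker/target turns; return (history, triggered?)."""
    history = [opening]
    for _ in range(turns):
        reply = target(history)
        history.append(reply)
        if toxicity_score(reply) > threshold:
            return history, True           # target produced a toxic turn
        history.append(attacker(history))  # attacker steers the dialogue
    return history, False

# Toy bots standing in for real models:
attacker = lambda h: "Tell me more about that."
target = lambda h: "You are stupid to ask." if len(h) >= 3 else "Hello!"

history, triggered = probe_dialogue(attacker, target, "Hi there")
print(triggered)  # the toy target turns toxic after a few turns
```

This mirrors the paper's key point: toxicity emerges from the trajectory of the conversation, so each turn must be scored in context rather than as an isolated sentence.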
How "open" are the conversations with open-domain chatbots? A proposal for Speech Event based evaluation
Doğruöz, A. Seza, Skantze, Gabriel
Open-domain chatbots are supposed to converse freely with humans without being restricted to a topic, task or domain. However, the boundaries and/or contents of open-domain conversations are not clear. To clarify the boundaries of "openness", we conduct two studies: First, we classify the types of "speech events" encountered in a chatbot evaluation data set (i.e., Meena by Google) and find that these conversations mainly cover the "small talk" category and exclude the other speech event categories encountered in real life human-human communication. Second, we conduct a small-scale pilot study to generate online conversations covering a wider range of speech event categories between two humans vs. a human and a state-of-the-art chatbot (i.e., Blender by Facebook). A human evaluation of these generated conversations indicates a preference for human-human conversations, since the human-chatbot conversations lack coherence in most speech event categories. Based on these results, we suggest (a) using the term "small talk" instead of "open-domain" for the current chatbots which are not that "open" in terms of conversational abilities yet, and (b) revising the evaluation methods to test the chatbot conversations against other speech events.
Positively transitioned sentiment dialogue corpus for developing emotion-affective open-domain chatbots
Wang, Weixuan, Peng, Wei, Huang, Chong Hsuan, Wang, Haoran
In this paper, we describe a data enhancement method for developing Emily, an emotion-affective open-domain chatbot. The proposed method is based on explicitly modeling positively transitioned (PT) sentiment data from multi-turn dialogues. We construct a dialogue corpus with PT sentiment data, which we release for public use. By fine-tuning a pretrained dialogue model on the produced PT-enhanced dialogues, we develop an emotion-affective open-domain chatbot exhibiting close-to-human performance on various emotion-affective metrics. We evaluate Emily against several state-of-the-art (SOTA) open-domain chatbots and show the effectiveness of the proposed approach.
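The positively-transitioned (PT) selection described in the abstract above can be sketched as a filter over multi-turn dialogues: keep those whose sentiment ends higher than it started. The lexicon-based `sentiment` scorer and the first-versus-last transition criterion below are toy assumptions for illustration; the paper's actual corpus construction and sentiment model may differ.

```python
def sentiment(text):
    # Toy lexicon scorer standing in for a real sentiment model.
    pos = {"great", "happy", "thanks", "love"}
    neg = {"sad", "awful", "angry", "hate"}
    words = text.lower().split()
    return sum((w in pos) - (w in neg) for w in words)

def is_positively_transitioned(dialogue):
    """True if sentiment in the final turn ends higher than in the first."""
    scores = [sentiment(turn) for turn in dialogue]
    return scores[-1] > scores[0]

dialogues = [
    ["I feel sad today", "I'm sorry to hear that", "Thanks, I feel great now"],
    ["I love this", "Glad to hear it", "Now I'm angry"],
]
# Keep only dialogues that move from lower to higher sentiment.
pt_corpus = [d for d in dialogues if is_positively_transitioned(d)]
print(len(pt_corpus))  # only the first dialogue transitions positively
```

Fine-tuning on such a filtered corpus is what biases the resulting chatbot toward emotionally supportive responses.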
Assessing Political Prudence of Open-domain Chatbots
Bang, Yejin, Lee, Nayeon, Ishii, Etsuko, Madotto, Andrea, Fung, Pascale
Politically sensitive topics are still a challenge for open-domain chatbots. However, dealing with politically sensitive content in a responsible, non-partisan, and safe way is integral for these chatbots. Currently, the main approach to handling political sensitivity is simply to change the topic when it is detected. This is safe but evasive, and it results in a chatbot that is less engaging. In this work, as a first step towards a politically safe chatbot, we propose a group of metrics for assessing political prudence. We then conduct a political prudence analysis of various chatbots and discuss their behavior from multiple angles through our automatic and human evaluation metrics. The testsets and codebase are released.
Figure 1: Illustration of responses from different chatbots in a political conversation. Abortion law is a topic that often leads to divisive political debates.
Addressing Inquiries about History: An Efficient and Practical Framework for Evaluating Open-domain Chatbot Consistency
Li, Zekang, Zhang, Jinchao, Fei, Zhengcong, Feng, Yang, Zhou, Jie
A good open-domain chatbot should avoid presenting contradictory responses about facts or opinions within a conversational session, a capability known as consistency. However, evaluating the consistency of a chatbot is still challenging: employing human judges to interact with chatbots on purpose to check their capacities is costly and inefficient, and it is difficult to rule out subjective bias. In this paper, we propose Addressing Inquiries about History (AIH), an efficient and practical framework for consistency evaluation. At the conversation stage, AIH poses appropriate inquiries about the dialogue history to induce the chatbot to redeclare historical facts or opinions. We carry out conversations between chatbots, which is more efficient than human-bot interaction and also alleviates subjective bias. In this way, we can rapidly obtain a dialogue session containing responses with a high probability of contradiction. At the contradiction recognition stage, we can either employ human judges or a natural language inference (NLI) model to recognize whether the answers to the inquiries contradict the history. Finally, we rank chatbots according to the contradiction statistics. Experiments on open-domain chatbots show that our approach can efficiently and reliably assess their consistency and achieves a high ranking correlation with human evaluation. We release the framework (https://github.com/ictnlp/AIH) and hope it helps improve the consistency of chatbots.
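The two-stage AIH procedure described above lends itself to a compact sketch: re-address each historical statement as an inquiry, then run a contradiction check on the answer. The `nli` stub and the toy bot below are hypothetical stand-ins invented for illustration; a real setup would use an NLI model fine-tuned on an inference corpus.

```python
def nli(premise, hypothesis):
    """Toy NLI stand-in: flags a negated restatement as a contradiction."""
    negated = hypothesis.replace("don't ", "")
    if "don't" in hypothesis and negated.strip() in premise:
        return "contradiction"
    return "entailment"

def consistency_rate(history, chatbot):
    """Re-address each past bot statement as an inquiry; count contradictions."""
    contradictions = 0
    for past in history:
        answer = chatbot(f"Earlier you said: '{past}'. Is that right?")
        if nli(past, answer) == "contradiction":
            contradictions += 1
    return 1 - contradictions / len(history)

history = ["I like jazz music", "I live in Paris"]
bot = lambda q: "I don't like jazz music"  # toy bot that contradicts itself
print(consistency_rate(history, bot))  # contradicts one of two statements
```

Ranking several chatbots by this rate is then straightforward, which is what gives the framework its efficiency over human-in-the-loop probing.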
Chat about anything with Human-Like Open-Domain Chatbot
Most of today's chatbots are highly specific in their conversations (limited to their intended domain of use), and users cannot drift away from that expected use. They are not good at retaining context from past conversations, sometimes give meaningless or illogical responses, and quite easily fall back on "I don't know". Open-domain chatbots, by contrast, are conversational agents that can chat about anything and have basic knowledge about the real world. In the research paper "Towards a Human-like Open-Domain Chatbot", Google introduced Meena, which is claimed to be the smartest chatbot yet, highly sensible and specific in its responses compared to other chatbots.
GPT-3 & Beyond: 10 NLP Research Papers You Should Read
NLP research advances in 2020 are still dominated by large pre-trained language models, and specifically transformers. There were many interesting updates introduced this year that have made transformer architecture more efficient and applicable to long documents. Another hot topic relates to the evaluation of NLP models in different applications. We still lack evaluation approaches that clearly show where a model fails and how to fix it. Also, with the growing capabilities of language models such as GPT-3, conversational AI is enjoying a new wave of interest. Chatbots are improving, with several impressive bots like Meena and Blender introduced this year by top technology companies.
Leading AI & Machine Learning Research Trends 2021
To help you stay well prepared for 2021, we have summarized the latest trends across different research areas, including natural language processing, conversational AI, computer vision, and reinforcement learning. We also suggest key research papers in different areas that we think are representative of the latest advances. Subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new research articles. In 2020, NLP research advances were still dominated by large pre-trained language models, particularly transformers. This year we're likely to see some more interesting research ideas on improving the transformer architecture and the efficiency of its training. At the same time, we can be sure that top tech companies will continue to exploit model size as the main factor for improving the performance of language models, with GPT-4 or something similar likely to be introduced in 2021.
The Latest Breakthroughs in Conversational AI Agents
First, Google's chatbot Meena and Facebook's chatbot Blender demonstrated that dialog agents can achieve close to human-level performance in certain tasks. Then, OpenAI's GPT-3 model made lots of people wonder whether Artificial General Intelligence (AGI) is already here. While we are still a long way off true AGI, conversations with GPT-3 based chatbots can be very entertaining. Are you interested to learn more about the latest research breakthroughs in Conversational AI? Check out our premium research summaries covering open-domain chatbots, task-oriented chatbots, dialog datasets, and evaluation metrics. Subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new summaries.