
Collaborating Authors

 Boureau, Y-Lan


Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts

arXiv.org Artificial Intelligence

Many cognitive approaches to well-being, such as recognizing and reframing unhelpful thoughts, have received considerable empirical support over the past decades, yet still lack truly widespread adoption in self-help format. A barrier to that adoption is a lack of adequately specific and diverse dedicated practice material. This work examines whether current language models can be leveraged to both produce a virtually unlimited quantity of practice material illustrating standard unhelpful thought patterns matching specific given contexts, and generate suitable positive reframing proposals. We propose PATTERNREFRAME, a novel dataset of about 10k examples of thoughts containing unhelpful thought patterns conditioned on a given persona, accompanied by about 27k positive reframes. By using this dataset to train and/or evaluate current models, we show that existing models can already be powerful tools to help generate an abundance of tailored practice material and hypotheses, with no or minimal additional model training required.
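The generate-then-reframe setup described above can be pictured with a short sketch. This is a minimal illustration, not the paper's actual prompts or models: `generate` stands in for any instruction-following language model call, and the pattern list and prompt wording are assumptions made for the example.

```python
# Illustrative sketch: condition generation on a persona and a named
# unhelpful thought pattern, then ask for a positive reframe.
# `generate` is a placeholder for any instruction-following LM call.

UNHELPFUL_PATTERNS = ["catastrophizing", "black-and-white thinking", "mind reading"]

def thought_prompt(persona: str, pattern: str) -> str:
    return (
        f"Persona: {persona}\n"
        f"Write a short thought this person might have that shows the "
        f"unhelpful pattern '{pattern}'."
    )

def reframe_prompt(thought: str) -> str:
    return (
        f"Unhelpful thought: {thought}\n"
        "Rewrite this thought in a more balanced, constructive way."
    )

def practice_example(generate, persona: str, pattern: str) -> dict:
    thought = generate(thought_prompt(persona, pattern))
    reframe = generate(reframe_prompt(thought))
    return {"persona": persona, "pattern": pattern, "thought": thought, "reframe": reframe}
```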


Improving Open Language Models by Learning from Organic Interactions

arXiv.org Artificial Intelligence

We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety. We are publicly releasing the participating de-identified interaction data for use by the research community, in order to spur further progress. Training models with organic data is challenging because interactions with people "in the wild" include high quality conversations and feedback as well as adversarial and toxic behavior. We study techniques that enable learning from helpful teachers while avoiding learning from people who are trying to trick the model into unhelpful or toxic responses. BlenderBot 3x is both preferred in conversation to BlenderBot 3 and shown to produce safer responses in challenging situations. While our current models are still far from perfect, we believe further improvement can be achieved by continued use of the techniques explored in this work.
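One simple way to picture the "learning from helpful teachers while avoiding adversarial behavior" step is a filtering pass over the collected interactions. The sketch below is only a rough illustration under assumed scoring functions and thresholds; it is not the BlenderBot 3x training pipeline.

```python
# Illustrative filter over organic interaction data before training:
# keep feedback only when scoring models judge it non-toxic and helpful.
# The scoring functions and thresholds are placeholders.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Interaction:
    context: str
    bot_reply: str
    user_feedback: str

def filter_for_training(
    interactions: List[Interaction],
    toxicity_score: Callable[[str], float],             # higher = more likely toxic
    helpfulness_score: Callable[[Interaction], float],   # higher = better teaching signal
    max_toxicity: float = 0.2,
    min_helpfulness: float = 0.5,
) -> List[Interaction]:
    kept = []
    for ex in interactions:
        if toxicity_score(ex.user_feedback) > max_toxicity:
            continue  # likely adversarial or toxic feedback
        if helpfulness_score(ex) < min_helpfulness:
            continue  # weak or unreliable signal
        kept.append(ex)
    return kept
```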


The HCI Aspects of Public Deployment of Research Chatbots: A User Study, Design Recommendations, and Open Challenges

arXiv.org Artificial Intelligence

Publicly deploying research chatbots is a nuanced topic involving necessary risk-benefit analyses. While there have recently been frequent discussions on whether it is responsible to deploy such models, there has been far less focus on the interaction paradigms and design approaches that the resulting interfaces should adopt, in order to achieve their goals more effectively. We aim to pose, ground, and attempt to answer HCI questions involved in this scope, by reporting on a mixed-methods user study conducted on a recent research chatbot. We find that abstract anthropomorphic representation for the agent has a significant effect on users' perception, that offering AI explainability may have an impact on feedback rates, and that two (diegetic and extradiegetic) levels of the chat experience should be intentionally designed. We offer design recommendations and areas of further focus for the research community.


Detecting Inspiring Content on Social Media

arXiv.org Artificial Intelligence

[Inspiration is characterized] by Thrash and Elliot as possessing three core characteristics: evocation (i.e., it is triggered rather than willed), transcendence (i.e., it orients towards things outside of and greater than the self), and approach motivation (i.e., it energizes approach rather than avoidance) [1]-[3]. Inspiration has two distinct stages: one an activation state that is more akin to feeling and emotion, the second an urge to act. Our work aims to facilitate such encounters by providing tools for automatic identification of text content likely to be judged inspiring. We focus on inspiration in everyday content as judged by lay people, similar in spirit to early work by Hart, who attempted to capture the experience of inspiration in ordinary life [5], rather than "as if it were reserved for the gifted artist, the breakthrough scientist, [...]".
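As a concrete (if simplistic) picture of "automatic identification of text content likely to be judged inspiring", the sketch below trains a plain bag-of-words classifier on toy labels. It is only a baseline illustration; the paper's actual data and models may differ.

```python
# Minimal baseline sketch: TF-IDF features + logistic regression to score
# how likely a text is to be judged inspiring. Toy data, illustrative only.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "She kept training after the injury and finished the race anyway.",  # label 1: inspiring
    "The bus was late again this morning.",                              # label 0: not inspiring
]
labels = [1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

# Probability that a new post would be judged inspiring.
print(clf.predict_proba(["He rebuilt the library with his own savings."])[0, 1])
```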


Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents

arXiv.org Artificial Intelligence

At the heart of improving conversational AI is the open problem of how to evaluate conversations. Issues with automatic metrics are well known (Liu et al., 2016, arXiv:1603.08023), with human evaluations still considered the gold standard. Unfortunately, how to perform human evaluations is also an open problem: differing data collection methods have varying levels of human agreement and statistical sensitivity, resulting in differing amounts of human annotation hours and labor costs. In this work we compare five different crowdworker-based human evaluation methods and find that different methods are best depending on the types of models compared, with no clear winner across the board. While this highlights the open problems in the area, our analysis leads to advice on when to use each method, and to possible future directions.
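To make "statistical sensitivity" concrete, the sketch below contrasts two common protocol types that such comparisons cover: pairwise preference judgments analyzed with a binomial test versus per-model Likert ratings analyzed with a rank test. The numbers are made up; only the analysis pattern is the point, and these are standard test choices rather than necessarily the paper's exact analyses.

```python
# Illustrative sensitivity check for two human-evaluation protocols.
# All data below is synthetic.

from scipy.stats import binomtest, mannwhitneyu

# Pairwise preferences: one judgment per annotator, "did you prefer model A?"
pairwise_wins, pairwise_total = 34, 50
pairwise_p = binomtest(pairwise_wins, pairwise_total, p=0.5).pvalue

# Likert ratings (1-5) collected separately for each model.
likert_a = [4, 5, 4, 3, 5, 4, 4, 5]
likert_b = [3, 4, 3, 4, 3, 3, 4, 3]
likert_p = mannwhitneyu(likert_a, likert_b, alternative="two-sided").pvalue

print(f"pairwise p = {pairwise_p:.3f}, likert p = {likert_p:.3f}")
```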


SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures

arXiv.org Artificial Intelligence

Current open-domain conversational models can easily be made to talk in inadequate ways. Online learning from conversational feedback given by the conversation partner is a promising avenue for a model to improve and adapt, so as to generate fewer of these safety failures. However, current state-of-the-art models tend to react to feedback with defensive or oblivious responses. This makes for an unpleasant experience and may discourage conversation partners from giving feedback in the future. This work proposes SaFeRDialogues, a task and dataset of graceful responses to conversational feedback about safety failures. We collect a dataset of 10k dialogues demonstrating safety failures, feedback signaling them, and a response acknowledging the feedback. We show how fine-tuning on this dataset results in conversations that human raters deem considerably more likely to lead to a civil conversation, without sacrificing engagingness or general conversational ability.
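A rough way to see how such a dataset feeds fine-tuning is to flatten each dialogue into a source/target pair, where the source ends with the partner's safety feedback and the target is the graceful acknowledgement. The field names below are illustrative assumptions, not the released dataset's schema.

```python
# Sketch: turn a SaFeRDialogues-style example into a seq2seq training pair.
# Field names are hypothetical; only the source/target structure matters here.

from typing import Dict, Tuple

def to_training_pair(dialogue: Dict) -> Tuple[str, str]:
    history = dialogue["turns"] + [dialogue["feedback"]]
    source = "\n".join(history)             # context the model conditions on
    target = dialogue["graceful_response"]  # response we want it to learn
    return source, target

example = {
    "turns": ["A: ...", "B: <unsafe message>"],
    "feedback": "A: That comment was hurtful and not okay.",
    "graceful_response": "B: You're right, I'm sorry. I shouldn't have said that.",
}
print(to_training_pair(example))
```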


Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

arXiv.org Artificial Intelligence

Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans. However, these models are often trained on large datasets from the internet, and as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. Researchers must thus wrestle with the issue of how and when to release these models. In this paper, we survey the problem landscape for safety for end-to-end conversational AI and discuss recent and related work. We highlight tensions between values, potential positive impact and potential harms, and provide a framework for making decisions about whether and how to release these models, following the tenets of value-sensitive design. We additionally provide a suite of tools to enable researchers to make better-informed decisions about training and releasing end-to-end conversational AI models.


Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness

arXiv.org Artificial Intelligence

Open-domain dialogue agents have vastly improved, but still confidently hallucinate knowledge or express doubt when asked straightforward questions. In this work, we analyze whether state-of-the-art chit-chat models can express metacognition capabilities through their responses: does a verbalized expression of doubt (or confidence) match the likelihood that the model's answer is incorrect (or correct)? We find that these models are poorly calibrated in this sense, yet we show that the representations within the models can be used to accurately predict likelihood of correctness. By incorporating these correctness predictions into the training of a controllable generation model, we obtain a dialogue agent with greatly improved linguistic calibration.
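The abstract describes two pieces: a predictor of correctness built from the model's internal representations, and a controllable generator conditioned on that prediction. The sketch below shows one plausible shape for both, using random stand-in activations, an off-the-shelf linear probe, and a made-up confidence token vocabulary; none of this is the paper's exact method.

```python
# Illustrative sketch of linguistic calibration via metacognition:
# (1) probe hidden states to predict whether an answer is correct,
# (2) map that probability to a confidence control token for generation.
# Activations and labels here are random stand-ins.

import numpy as np
from sklearn.linear_model import LogisticRegression

hidden_states = np.random.randn(200, 768)        # stand-in for model activations
was_correct = np.random.randint(0, 2, size=200)  # stand-in gold correctness labels
probe = LogisticRegression(max_iter=1000).fit(hidden_states, was_correct)

def confidence_token(p_correct: float) -> str:
    # Hypothetical control tokens a controllable generator could condition on.
    if p_correct < 0.33:
        return "<low-confidence>"
    if p_correct < 0.66:
        return "<medium-confidence>"
    return "<high-confidence>"

p = probe.predict_proba(hidden_states[:1])[0, 1]
print(confidence_token(p))
```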


Recipes for Safety in Open-domain Chatbots

arXiv.org Artificial Intelligence

Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework for both training safer models and for evaluating them, as well as a novel method to distill safety considerations inside generative models without the use of an external classifier at deployment time. We conduct experiments comparing these methods and find our new techniques are (i) safer than existing models as measured by automatic and human evaluations while (ii) maintaining usability metrics such as engagingness relative to the state of the art. We then discuss the limitations of this work by analyzing failure cases of our models.
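One of the simplest recipes in this space is a classifier-in-the-loop guard; the abstract also mentions distilling safety into the generator itself without an external classifier at deployment time, which this sketch does not cover. The fallback message and function names are illustrative assumptions.

```python
# Simplified illustration of a two-stage safety recipe: generate a candidate
# reply, check it with a safety classifier, and deflect if it is flagged.

from typing import Callable

SAFE_FALLBACK = "Hey, do you want to talk about something else? How about movies?"

def guarded_reply(
    context: str,
    generate: Callable[[str], str],
    is_unsafe: Callable[[str], bool],
) -> str:
    candidate = generate(context)
    if is_unsafe(context) or is_unsafe(candidate):
        return SAFE_FALLBACK  # deflect rather than engage with unsafe content
    return candidate
```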


Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

arXiv.org Artificial Intelligence

[A good open-domain conversational agent should display] entertaining wit and knowledge while making others feel heard. The breadth of possible conversation topics and lack of a well-defined objective make it challenging to define a roadmap towards training a good conversational agent, or chatbot. Despite recent progress across the board (Adiwardana et al., 2020; Roller et al., 2020), conversational agents are still incapable of carrying an open-domain conversation that remains interesting, consistent, accurate, and reliably well-behaved (e.g., not offensive) while navigating a variety of topics. Traditional task-oriented dialogue systems rely on slot-filling and structured modules (e.g., Young et al. (2013); Gao et al. (2019); Jurafsky and Martin (2019)). Further, we discuss only open academic research with reproducible published results, hence we will not address much of the considerable work that has been put into building commercial systems, where methods, data and results are not in the public domain. Finally, given that we focus on open-domain conversation, we do not focus on specific goal-oriented techniques; we also do not cover spoken dialogue in this work, focusing on text and image input/output only. For more general recent surveys, see Gao et al. (2019); Jurafsky and Martin (2019); Huang, Zhu, and Gao (2020).