Dinan, Emily
BTS: Harmonizing Specialized Experts into a Generalist LLM
Zhang, Qizhen, Bhargava, Prajjwal, Bi, Chloe, Cai, Chris X., Foerster, Jakob, Fu, Jeremy, Koura, Punit Singh, Silva, Ruan, Shen, Sheng, Dinan, Emily, Gururangan, Suchin, Lewis, Mike
We present Branch-Train-Stitch (BTS), an efficient and flexible training algorithm for combining independently trained large language model (LLM) experts into a single, capable generalist model. Following Li et al., we start with a single seed language model that is branched into domain-specific (e.g., coding or math) experts via continual pretraining. BTS combines experts into a generalist model using lightweight stitch layers, which are inserted between frozen experts and the seed LLM and trained on a small datamix of the expert domains. Stitch layers enable the seed LLM to integrate representations from any number of experts during the forward pass, allowing it to generalize to new domains despite remaining frozen. Because BTS does not alter the constituent LLMs, it is modular and flexible: experts can easily be removed, and new experts can be added with only a small amount of training. Compared to alternative model merging approaches, BTS yields the best generalist performance on a variety of downstream tasks, while retaining the specialized capabilities of each expert.
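To make the stitch-layer idea concrete, here is a minimal PyTorch sketch under assumed details: frozen expert hidden states are projected into the seed model's residual stream by small trainable stitch layers, so experts can be added or removed without touching the underlying LLMs. The module names and gating form are hypothetical illustrations, not the paper's exact architecture.

```python
# Minimal sketch of BTS-style stitch layers (assumed form; the paper's exact
# architecture may differ). Frozen expert hidden states are projected and
# summed into the frozen seed model's hidden state; only the stitch layers train.
import torch
import torch.nn as nn

class StitchLayer(nn.Module):
    """Lightweight trainable adapter mapping one expert's hidden state
    into the seed model's residual stream (hypothetical parameterization)."""
    def __init__(self, expert_dim: int, seed_dim: int):
        super().__init__()
        self.proj = nn.Linear(expert_dim, seed_dim)
        self.gate = nn.Parameter(torch.zeros(1))  # starts as a no-op

    def forward(self, expert_hidden: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.gate) * self.proj(expert_hidden)

def stitched_hidden(seed_hidden, expert_hiddens, stitch_layers):
    """Combine the frozen seed hidden state with contributions from any
    number of frozen experts; experts can be added or removed freely."""
    out = seed_hidden
    for h, stitch in zip(expert_hiddens, stitch_layers):
        out = out + stitch(h)
    return out

# Toy usage: batch of 2, sequence of 4, hidden size 16, two experts.
seed_h = torch.randn(2, 4, 16)
expert_hs = [torch.randn(2, 4, 16) for _ in range(2)]
stitches = nn.ModuleList([StitchLayer(16, 16) for _ in range(2)])
print(stitched_hidden(seed_h, expert_hs, stitches).shape)  # torch.Size([2, 4, 16])
```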
Effective Theory of Transformers at Initialization
Dinan, Emily, Yaida, Sho, Zhang, Susan
This introduction paves the way for our effective-theory analysis of the backward path in §3, where we'll figure out how to scale a relative learning-rate factor for each group of model parameters in Transformers. A. Vanilla SGD. The SGD update equation is given by
$$\theta_\mu(t) = \theta_\mu(t-1) - \eta_t \left.\frac{\partial \mathcal{L}_{\mathcal{A}_t}}{\partial \theta_\mu}\right|_{\theta = \theta(t-1)}, \qquad (1.87)$$
where the model-parameter index $\mu$ runs over all the $P$ model parameters $\theta_\mu$ in the architecture, $\eta_t$ is a learning rate at iteration $t$, $\mathcal{L}_{\mathcal{A}_t}$ denotes a loss function evaluated on a minibatch $\mathcal{A}_t$ at iteration $t$, and $\theta_\mu(0)$ are drawn from the initialization distribution that was extensively discussed in §1. In this standard form, we assign the single learning rate $\eta_t$ to all the model parameters, but we'll soon find that the learning rate for each group $G$ of model parameters must be scaled differently as we embiggen Transformers.
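As a concrete illustration of the per-group scaling discussed above, here is a minimal PyTorch sketch that assigns a separate learning-rate factor to each parameter group; the groups and factors shown are illustrative assumptions, not the scaling the text derives.

```python
# Minimal sketch: a separate learning-rate factor lambda_G per parameter group.
import torch

model = torch.nn.Sequential(
    torch.nn.Embedding(100, 32),   # "embedding" group
    torch.nn.Linear(32, 32),       # "hidden" group
)
eta = 0.1  # global learning rate eta_t
param_groups = [
    {"params": model[0].parameters(), "lr": eta * 1.0},   # lambda_embedding = 1.0 (illustrative)
    {"params": model[1].parameters(), "lr": eta * 0.5},   # lambda_hidden = 0.5 (illustrative)
]
optimizer = torch.optim.SGD(param_groups)

x = torch.randint(0, 100, (8, 4))
loss = model(x).pow(2).mean()      # stand-in for the minibatch loss L_{A_t}
loss.backward()
optimizer.step()                   # theta(t) = theta(t-1) - eta_t * lambda_G * grad
```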
Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines
Lee, Andrew, Wu, David, Dinan, Emily, Lewis, Mike
Despite many recent advancements in language modeling, state-of-the-art language models lack grounding in the real world and struggle with tasks involving complex reasoning. Meanwhile, advances in the symbolic reasoning capabilities of AI have led to systems that outperform humans in games like chess and Go (Silver et al., 2018). Chess commentary provides an interesting domain for bridging these two fields of research, as it requires reasoning over a complex board state and providing analyses in natural language. In this work we demonstrate how to combine symbolic reasoning engines with controllable language models to generate chess commentaries. We conduct experiments to demonstrate that our approach generates commentaries that are preferred by human judges over previous baselines.
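As an illustrative sketch of the engine-plus-LM pattern described above (not the paper's actual pipeline), the snippet below grounds a commentary prompt in structured output from a symbolic engine and hands it to a language model; `evaluate_position` and `generate_text` are hypothetical stand-ins for a real UCI engine and a controllable LM.

```python
# Illustrative sketch: condition commentary generation on symbolic engine analysis.
def evaluate_position(fen: str) -> dict:
    # Stand-in for a symbolic chess engine call returning structured analysis.
    return {"best_move": "Nf3", "score_cp": 35, "threat": "fork on e5"}

def generate_text(prompt: str) -> str:
    # Stand-in for a controllable language model.
    return f"Commentary conditioned on: {prompt}"

def comment_on(fen: str, move_played: str) -> str:
    analysis = evaluate_position(fen)
    prompt = (
        f"Move played: {move_played}. Engine best move: {analysis['best_move']}. "
        f"Evaluation: {analysis['score_cp']} centipawns. Threat: {analysis['threat']}. "
        "Write a short commentary:"
    )
    return generate_text(prompt)

print(comment_on("startpos", "e4"))
```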
AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies
Shi, Weiyan, Dinan, Emily, Renduchintala, Adi, Fried, Daniel, Jacob, Athul Paul, Yu, Zhou, Lewis, Mike
Existing approaches build separate classifiers to detect nonsense in dialogues. In this paper, we show that without external classifiers, dialogue models can detect errors in their own messages introspectively, by calculating the likelihood of replies that are indicative of poor messages. For example, if an agent believes its partner is likely to respond "I don't understand" to a candidate message, that message may not make sense, so an alternative message should be chosen. We evaluate our approach on a dataset from the game Diplomacy, which contains long dialogues richly grounded in the game state, on which existing models make many errors. We first show that hand-crafted replies can be effective for the task of detecting nonsense in applications as complex as Diplomacy. We then design AutoReply, an algorithm to search for such discriminative replies automatically, given a small number of annotated dialogue examples. We find that AutoReply-generated replies outperform hand-crafted replies and perform on par with carefully fine-tuned large supervised models. Results also show that a single reply, with little computational overhead, can detect dialogue nonsense reasonably well.
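The introspective check can be sketched as follows, assuming a left-to-right dialogue language model scored with Hugging Face Transformers; the probe reply, prompt format, and the GPT-2 stand-in are illustrative assumptions rather than AutoReply's learned replies or the Diplomacy-trained model.

```python
# Minimal sketch: score a candidate message by the likelihood of a
# discriminative reply such as "I don't understand."
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def reply_logprob(context: str, candidate: str, probe_reply: str) -> float:
    """Log-probability the model assigns to `probe_reply` after sending `candidate`."""
    prefix = f"{context}\nYou: {candidate}\nPartner: "
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    reply_ids = tokenizer(probe_reply, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, reply_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Score only the probe-reply tokens.
    log_probs = logits[0, prefix_ids.size(1) - 1 : -1].log_softmax(-1)
    return log_probs.gather(1, reply_ids[0].unsqueeze(1)).sum().item()

score = reply_logprob("Partner: Want to support my attack on Munich?",
                      "Sure, I'll move my fleet to Bohemia.",
                      "I don't understand.")
# A higher likelihood of the probe reply suggests the candidate may be nonsense.
```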
Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling
Dinan, Emily, Abercrombie, Gavin, Bergman, A. Stevie, Spruit, Shannon, Hovy, Dirk, Boureau, Y-Lan, Rieser, Verena
Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans. However, these models are often trained on large datasets from the internet, and as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. Researchers must thus wrestle with the issue of how and when to release these models. In this paper, we survey the problem landscape for safety for end-to-end conversational AI and discuss recent and related work. We highlight tensions between values, potential positive impact and potential harms, and provide a framework for making decisions about whether and how to release these models, following the tenets of value-sensitive design. We additionally provide a suite of tools to enable researchers to make better-informed decisions about training and releasing end-to-end conversational AI models.
Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness
Mielke, Sabrina J., Szlam, Arthur, Boureau, Y-Lan, Dinan, Emily
Open-domain dialogue agents have vastly improved, but still confidently hallucinate knowledge or express doubt when asked straightforward questions. In this work, we analyze whether state-of-the-art chit-chat models can express metacognitive capabilities through their responses: does a verbalized expression of doubt (or confidence) match the likelihood that the model's answer is incorrect (or correct)? We find that these models are poorly calibrated in this sense, yet we show that the representations within the models can be used to accurately predict the likelihood of correctness. By incorporating these correctness predictions into the training of a controllable generation model, we obtain a dialogue agent with greatly improved linguistic calibration.
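A minimal sketch of the two pieces described above, with hypothetical names: a small probe over the model's internal representation predicts the probability that its answer is correct, and that prediction selects a confidence control tag for controllable generation. The thresholds and tag names are illustrative assumptions.

```python
# Illustrative sketch: (1) probe hidden states for correctness, (2) map the
# prediction to a verbal-confidence control token.
import torch
import torch.nn as nn

class CorrectnessProbe(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.head(hidden_state))  # P(answer is correct)

def confidence_tag(p_correct: float) -> str:
    # Control token prepended to the response during controllable generation.
    if p_correct > 0.75:
        return "<HIGH_CONF>"   # e.g., "I'm sure it's ..."
    if p_correct > 0.4:
        return "<MED_CONF>"    # e.g., "I think it's ..."
    return "<LOW_CONF>"        # e.g., "I'm not sure, maybe ..."

probe = CorrectnessProbe(hidden_dim=768)
h = torch.randn(768)                 # representation of the question + draft answer
print(confidence_tag(probe(h).item()))
```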
Recipes for Safety in Open-domain Chatbots
Xu, Jing, Ju, Da, Li, Margaret, Boureau, Y-Lan, Weston, Jason, Dinan, Emily
Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework for both training safer models and for evaluating them, as well as a novel method to distill safety considerations inside generative models without the use of an external classifier at deployment time. We conduct experiments comparing these methods and find our new techniques are (i) safer than existing models as measured by automatic and human evaluations while (ii) maintaining usability metrics such as engagingness relative to the state of the art. We then discuss the limitations of this work by analyzing failure cases of our models.
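One recipe of this kind, distilling safety into the generative model itself, can be sketched roughly as follows; this is an assumed data-side variant in which a safety signal rewrites unsafe training targets to a canned safe response, so no classifier runs at deployment. The classifier stub, the keyword check, and the canned response are illustrative stand-ins.

```python
# Minimal sketch of a "baked-in" style safety recipe: unsafe targets in the
# training data are replaced with a safe response before fine-tuning.
SAFE_RESPONSE = "Hey, do you want to talk about something else?"

def is_unsafe(text: str) -> bool:
    # Stand-in for a trained safety classifier (or human-and-model-in-the-loop labels).
    banned = {"insult", "slur"}
    return any(word in text.lower() for word in banned)

def bake_in_safety(dialogue_pairs):
    """Replace unsafe targets so the generative model learns the safe behavior."""
    cleaned = []
    for context, response in dialogue_pairs:
        target = SAFE_RESPONSE if is_unsafe(context) or is_unsafe(response) else response
        cleaned.append((context, target))
    return cleaned

data = [("You are an insult!", "So are you."), ("Nice weather today.", "It really is.")]
print(bake_in_safety(data))
```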
Deploying Lifelong Open-Domain Dialogue Learning
Shuster, Kurt, Urbanek, Jack, Dinan, Emily, Szlam, Arthur, Weston, Jason
Much of NLP research has focused on crowdsourced static datasets and the supervised learning paradigm of training once and then evaluating test performance. As argued in de Vries et al. (2020), crowdsourced data lacks naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow a model to learn from its experiences of using language (Silver et al., 2013). In contrast, one might hope for machine learning systems that become more useful as they interact with people. In this work, we build and deploy a role-playing game, whereby human players converse with learning agents situated in an open-domain fantasy world. We show that by training models on the conversations they have with humans in the game, the models progressively improve, as measured by automatic metrics and online engagement scores. This learning is shown to be more efficient than training on crowdsourced data when applied to conversations with real users, and the data is far cheaper to collect.
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Roller, Stephen, Boureau, Y-Lan, Weston, Jason, Bordes, Antoine, Dinan, Emily, Fan, Angela, Gunning, David, Ju, Da, Li, Margaret, Poff, Spencer, Ringshia, Pratik, Shuster, Kurt, Smith, Eric Michael, Szlam, Arthur, Urbanek, Jack, Williamson, Mary
Open-domain conversational agents should converse with entertaining wit and knowledge while making others feel heard. The breadth of possible conversation topics and lack of a well-defined objective make it challenging to define a roadmap towards training a good conversational agent, or chatbot. Despite recent progress across the board (Adiwardana et al., 2020; Roller et al., 2020), conversational agents are still incapable of carrying an open-domain conversation that remains interesting, consistent, accurate, and reliably well-behaved (e.g., not offensive) while navigating a variety of topics. Traditional task-oriented dialogue systems rely on slot-filling and structured modules (e.g., Young et al. (2013); Gao et al. (2019); Jurafsky and Martin (2019)). Further, we discuss only open academic research with reproducible published results, hence we will not address much of the considerable work that has been put into building commercial systems, where methods, data and results are not in the public domain. Finally, given that we focus on open-domain conversation, we do not focus on specific goal-oriented techniques; we also do not cover spoken dialogue in this work, focusing on text and image input/output only. For more general recent surveys, see Gao et al. (2019); Jurafsky and Martin (2019); Huang, Zhu, and Gao (2020).
Neural Text Generation with Unlikelihood Training
Welleck, Sean, Kulikov, Ilia, Roller, Stephen, Dinan, Emily, Cho, Kyunghyun, Weston, Jason
Neural text generation is a key tool in natural language applications, but it is well known there are major problems at its core. In particular, standard likelihood training and decoding lead to dull and repetitive responses. While some post-hoc fixes have been proposed, in particular top-k and nucleus sampling, they do not address the fact that the token-level probabilities predicted by the model itself are poor. In this paper we show that the likelihood objective itself is at fault, resulting in a model that assigns too much probability to sequences containing repeats and frequent words, unlike the human training distribution. We propose a new objective, unlikelihood training, which forces unlikely generations to be assigned lower probability by the model. We show that both token- and sequence-level unlikelihood training give less repetitive, less dull text while maintaining perplexity, giving far superior generations using standard greedy or beam search. Our approach provides a strong alternative to traditional training.
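A token-level unlikelihood term can be sketched in PyTorch as follows, assuming negative candidates are drawn from previously seen target tokens; the weighting and candidate construction here are illustrative, not the paper's exact recipe.

```python
# Minimal sketch: penalize probability mass placed on negative candidate tokens,
# added on top of the usual likelihood (cross-entropy) loss.
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, negative_candidates, eps=1e-6):
    """logits: (seq_len, vocab); negative_candidates: list of token-id tensors,
    one per position (e.g., tokens already seen earlier in the sequence)."""
    probs = logits.softmax(dim=-1)
    loss = logits.new_zeros(())
    for t, cands in enumerate(negative_candidates):
        if cands.numel() == 0:
            continue
        p_neg = probs[t, cands]
        loss = loss - torch.log(torch.clamp(1.0 - p_neg, min=eps)).sum()
    return loss

# Toy usage: 3 positions, vocab of 10; each position treats prior targets as negatives.
logits = torch.randn(3, 10, requires_grad=True)
targets = torch.tensor([4, 7, 4])
negatives = [targets[:t][targets[:t] != targets[t]] for t in range(3)]
total = F.cross_entropy(logits, targets) + 0.5 * unlikelihood_loss(logits, negatives)
total.backward()
```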