Goto

Collaborating Authors

 pushshift






Are Female Carpenters like Blue Bananas? A Corpus Investigation of Occupation Gender Typicality

arXiv.org Artificial Intelligence

People tend to use language to mention surprising properties of events: for example, when a banana is blue, we are more likely to mention color than when it is yellow. This fact is taken to suggest that yellowness is somehow a typical feature of bananas, and blueness is exceptional. Similar to how a yellow color is typical of bananas, there may also be genders that are typical of occupations. In this work, we explore this question using information theoretic techniques coupled with corpus statistic analysis. In two distinct large corpora, we do not find strong evidence that occupations and gender display the same patterns of mentioning as do bananas and color. Instead, we find that gender mentioning is correlated with femaleness of occupation in particular, suggesting perhaps that woman-dominated occupations are seen as somehow ``more gendered'' than male-dominated ones, and thereby they encourage more gender mentioning overall.


Now It Sounds Like You: Learning Personalized Vocabulary On Device

arXiv.org Artificial Intelligence

In recent years, Federated Learning (FL) has shown significant advancements in its ability to perform various natural language processing (NLP) tasks. This work focuses on applying personalized FL for on-device language modeling. Due to limitations of memory and latency, these models cannot support the complexity of sub-word tokenization or beam search decoding, resulting in the decision to deploy a closed-vocabulary language model. However, closed-vocabulary models are unable to handle out-of-vocabulary (OOV) words belonging to specific users. To address this issue, We propose a novel technique called "OOV expansion" that improves OOV coverage and increases model accuracy while minimizing the impact on memory and latency. This method introduces a personalized "OOV adapter" that effectively transfers knowledge from a central model and learns word embedding for personalized vocabulary. OOV expansion significantly outperforms standard FL personalization methods on a set of common FL benchmarks.


Reddit Is Already on the Rebound

WIRED

Social media researchers at the Network Contagion Research Institute in Princeton, New Jersey, got a rude awakening early last month. They were roused by 6:30 am phone calls from a colleague warning that Reddit had started blocking the institute's Pushshift service from updating its ongoing archive of every post on the discussion platform. That was a problem for more than just NCRI, because some of Reddit's 50,000 volunteer moderators depend on Pushshift to quickly investigate problem users, and many academics rely on the service. If it went stale, mods, as Reddit calls moderators, would have to work overtime or let more trash content accumulate. Researchers studying online communities would be forced to put projects and doctoral dissertations on ice.


An AI Twitter bot that only tweets good news, with Python and GPT2

#artificialintelligence

Running AI these days is increasingly simple due to the hard work of open source contributors producing top-notch libraries out there, and research groups opening up their work so others can build on it. One key library doing that is HuggingFace's Transformers library. HuggingFace are a startup building, amongst other NLP-related products, a library and model ecosystem that allows almost anyone to quickly and easily set up AI-powered chat bots that can consume or produce natural language. In this post, I'll demonstrate how I used this library to produce a Twitter bot that is only tweeting made-up (and slightly quirky) good news This blog post isn't meant to explain any theory, but for those who aren't familiar, the easiest way to explain this kind of AI, is they're sophisticated pattern recognition systems. If you feed it enough data, it can build up an ability to recognize the patterns in the english language, to the extent that if you ask it to repeat the pattern, not only will it generate mostly correct English grammar, it might also from time to time generate a coherent sentence!


SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures

arXiv.org Artificial Intelligence

Current open-domain conversational models can easily be made to talk in inadequate ways. Online learning from conversational feedback given by the conversation partner is a promising avenue for a model to improve and adapt, so as to generate fewer of these safety failures. However, current state-of-the-art models tend to react to feedback with defensive or oblivious responses. This makes for an unpleasant experience and may discourage conversation partners from giving feedback in the future. This work proposes SaFeRDialogues, a task and dataset of graceful responses to conversational feedback about safety failures. We collect a dataset of 10k dialogues demonstrating safety failures, feedback signaling them, and a response acknowledging the feedback. We show how fine-tuning on this dataset results in conversations that human raters deem considerably more likely to lead to a civil conversation, without sacrificing engagingness or general conversational ability.


Recipes for Safety in Open-domain Chatbots

arXiv.org Artificial Intelligence

Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework for both training safer models and for evaluating them, as well as a novel method to distill safety considerations inside generative models without the use of an external classifier at deployment time. We conduct experiments comparing these methods and find our new techniques are (i) safer than existing models as measured by automatic and human evaluations while (ii) maintaining usability metrics such as engagingness relative to the state of the art. We then discuss the limitations of this work by analyzing failure cases of our models.