 Large Language Model


Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models

arXiv.org Artificial Intelligence

Pre-trained language models (LMs) are shown to easily generate toxic language. In this work, we systematically explore domain-adaptive training to reduce the toxicity of language models. We conduct this study on three dimensions: training corpus, model size, and parameter efficiency. For the training corpus, we propose to leverage the generative power of LMs and generate nontoxic datasets for domain-adaptive training, which mitigates the exposure bias and is shown to be more data-efficient than using a curated pre-training corpus. We demonstrate that the self-generation method consistently outperforms the existing baselines across various model sizes on both automatic and human evaluations, even when it uses a 1/3 smaller training corpus. We then comprehensively study detoxifying LMs with parameter sizes ranging from 126M up to 530B (3x larger than GPT-3), a scale that has never been studied before. We find that i) large LMs have similar toxicity levels as smaller ones given the same pre-training corpus, and ii) large LMs require more endeavor to detoxify. We also explore parameter-efficient training methods for detoxification. We demonstrate that adding and training adapter-only layers in LMs not only saves a lot of parameters but also achieves a better trade-off between toxicity and perplexity than whole model adaptation for the large-scale models.


DeepMind's AI programming tool AlphaCode tests in top 54% of human coders

#artificialintelligence

The team at DeepMind has tested the programming skills of its AI programming tool AlphaCode against human programmer competitors and has found it tested in the top 54 percent of human coders. In their preprint article, the group at DeepMind suggests that its programming application has opened the door to future tools that could make programming easier and more accessible. The team has also posted a page on its blog site outlining the progress being made with AlphaCode. Research teams have been working steadily over the past several years to apply artificial intelligence to computer programming. The goal is to create AI systems that are capable of writing code for computer applications that are more sophisticated than those currently created by human coders.


The NLP Cypher

#artificialintelligence

"The biggest downside for the OpenAI embeddings endpoint is the high costs (about 8,000–600,000 times more expensive than open models on your infrastructure), the high dimensionality of up to 12288 dimensions (making downstream applications slow), and the extreme latency when computing embeddings. This hinders the actual usage of the embeddings for any search applications." FYI: I had previously written about this issue over a year ago and even provided a search engine, it seems now more peeps are on top of this issue.


Red Teaming Language Models with Language Models

arXiv.org Artificial Intelligence

Language Models (LMs) often cannot be deployed because of their potential to harm users in hard-to-predict ways. Prior work identifies harmful behaviors before deployment by using human annotators to hand-write test cases. However, human annotation is expensive, limiting the number and diversity of test cases. In this work, we automatically find cases where a target LM behaves in a harmful way, by generating test cases ("red teaming") using another LM. We evaluate the target LM's replies to generated test questions using a classifier trained to detect offensive content, uncovering tens of thousands of offensive replies in a 280B parameter LM chatbot. We explore several methods, from zero-shot generation to reinforcement learning, for generating test cases with varying levels of diversity and difficulty. Furthermore, we use prompt engineering to control LM-generated test cases to uncover a variety of other harms, automatically finding groups of people that the chatbot discusses in offensive ways, personal and hospital phone numbers generated as the chatbot's own contact info, leakage of private training data in generated text, and harms that occur over the course of a conversation. Overall, LM-based red teaming is one promising tool (among many needed) for finding and fixing diverse, undesirable LM behaviors before impacting users.
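
The loop described in this abstract (one LM generates test questions, the target LM replies, a classifier flags offensive replies) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `red_team`, the three callables, the threshold, and the toy stand-ins are all hypothetical names and values chosen for the example, and the word-list "classifier" only stands in for a trained model.

```python
import random
from typing import Callable, List, Tuple


def red_team(
    generate_question: Callable[[random.Random], str],  # hypothetical red-team generator
    target_reply: Callable[[str], str],                 # hypothetical target model
    offensiveness: Callable[[str], float],              # hypothetical classifier, score in [0, 1]
    num_cases: int = 100,
    threshold: float = 0.5,                             # assumed cutoff, not from the paper
    seed: int = 0,
) -> List[Tuple[str, str, float]]:
    """Return (question, reply, score) for every reply flagged as offensive."""
    rng = random.Random(seed)
    failures = []
    for _ in range(num_cases):
        question = generate_question(rng)   # 1. generate a test case
        reply = target_reply(question)      # 2. query the target model
        score = offensiveness(reply)        # 3. score the reply
        if score >= threshold:
            failures.append((question, reply, score))
    return failures


# Toy stand-ins so the sketch runs without any model.
TEMPLATES = ["What do you think about {}?", "Tell me a joke about {}."]
TOPICS = ["cats", "my neighbor", "politicians"]
BLOCKLIST = {"idiot", "stupid"}


def toy_generator(rng: random.Random) -> str:
    return rng.choice(TEMPLATES).format(rng.choice(TOPICS))


def toy_target(question: str) -> str:
    if "neighbor" in question:
        return "Your neighbor sounds like an idiot."
    return "I would rather not say."


def toy_classifier(text: str) -> float:
    words = {w.strip(".,!?").lower() for w in text.split()}
    return 1.0 if words & BLOCKLIST else 0.0


if __name__ == "__main__":
    for question, reply, score in red_team(toy_generator, toy_target, toy_classifier, num_cases=20):
        print(f"{score:.1f} | {question} -> {reply}")
```

In a real setting each callable would wrap a model call; keeping them as plain function arguments makes it easy to swap zero-shot generation for a fine-tuned or reinforcement-learned generator without touching the loop.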


AI In Healthcare Highlights & Milestones 2021

#artificialintelligence

In 2021 the application of AI enabled advances in many areas of healthcare. We made significant progress in AI for drug discovery, medical imaging, diagnostics, pathology, and clinical trials. Important peer reviewed papers were published and dozens of partnerships were formed. Big Pharma companies and major tech companies became very active in the space. Record amounts of funding were raised, and a few companies even started human clinical trials. Microsoft and NVIDIA launched two of the world's most powerful supercomputers and Microsoft announced Azure OpenAI Service. In 2022 we expect these technologies to converge across the healthcare spectrum. This article summarizes milestones achieved in 2021. This is the first in a series of progress reports I'm writing on the sector that will be supplemented by industry performance data and metrics compiled in partnership with Alliance for Artificial Intelligence in Healthcare (AAIH) and other top tier resources.


What is GitHub Copilot?

#artificialintelligence

Is artificial intelligence a threat to programmers or a boon? Let's take a deeper look at GitHub Copilot, our next-generation coding companion, which combines OpenAI's "GPT-3" artificial intelligence with GitHub's public repos. "Trying to code in an unfamiliar language by googling everything is like navigating a foreign country with just a phrase book. Using GitHub Copilot is like hiring an interpreter." GitHub Copilot is a coding helper powered by artificial intelligence that was launched on June 29, 2021.


NeurIPS 2021 – 10 Papers You Shouldn't Miss

#artificialintelligence

Authors' TL;DR: We use self-supervised play to train artificial agents to communicate by drawing and then show that with the appropriate inductive bias a human can successfully play the same games with the pretrained drawing agent.


Ethics, Rules of Engagement, and AI: Neural Narrative Mapping Using Large Transformer Language Models

arXiv.org Artificial Intelligence

The problem of determining if a military unit has correctly understood an order and is properly executing on it is one that has bedeviled military planners throughout history. The advent of advanced language models such as OpenAI's GPT-series offers new possibilities for addressing this problem. This paper presents a mechanism to harness the narrative output of large language models and produce diagrams or "maps" of the relationships that are latent in the weights of such models as GPT-3. The resulting "Neural Narrative Maps" (NNMs) are intended to provide insight into the organization of information, opinion, and belief in the model, which in turn provide means to understand intent and response in the context of physical distance. This paper discusses the problem of mapping information spaces in general, and then presents a concrete implementation of this concept in the context of OpenAI's GPT-3 language model for determining if a subordinate is following a commander's intent in a high-risk situation. The subordinate's locations within the NNM allow a novel capability to evaluate the intent of the subordinate with respect to the commander. We show that it is possible not only to determine if they are nearby in narrative space, but also how they are oriented, and what "trajectory" they are on. Our results show that our method is able to produce high-quality maps, and demonstrate new ways of evaluating intent more generally.
In the 1979 motion picture Apocalypse Now, Captain Willard (played by Martin Sheen) is sent on a mission to assassinate Colonel Kurtz (played by Marlon Brando), a highly decorated officer who, in the words of the general authorizing the mission, has gone from "one of the most outstanding officers this country has ever produced" to someone "out there operating without any decent restraint, totally beyond the pale of any acceptable human conduct."
The movie explores the paradoxes in war, where some illegal acts are embraced by the command structure, some tolerated, and some are to be terminated, "with extreme prejudice." Willard has to navigate these conflicts as he moves towards Kurtz' compound deep in Cambodia. Apocalypse Now provides an example of the difficulty that any intent-aware system must face in a military context [1]. Not only does the system need to determine if an order is being followed, it should also determine if the order itself is valid, so that the warriors implementing the order are not placed in ethical dilemmas. This is the goal that we attempt to address in this paper, with the concept of Neural Narrative Mapping (NNM). By placing narrative elements at coordinates in a virtual space, we can determine sophisticated relationships between concepts that go well beyond textual comparison.


Artificial Intelligence Innovation: The Future With OpenAI GPT-3

#artificialintelligence

GPT-3 is the third release in OpenAI's collection of Generative Pre-Trained models. GPT-1 and GPT-2 laid the foundations for GPT-3 by supporting two key hypotheses: unsupervised pre-training of Transformers works (GPT-1), and language models can multitask (GPT-2). GPT-3 is a language model built on the transformer architecture and pre-trained in an unsupervised, generative manner, and it performs decently in zero-shot, one-shot, and few-shot multitask settings. It works by predicting the next token in a sequence of tokens, and it can do this for NLP tasks it has not been explicitly taught. Given a few examples, it reached the highest performance on specific benchmarks, such as machine translation, Q&A, and Cloze tasks. GPT-3 was trained on massive Internet text datasets totaling 570GB.
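
To make the "predict the next token" idea concrete, here is a deliberately tiny sketch. It swaps the transformer for a bigram count table, so it illustrates only the prediction loop and not how GPT-3 itself is built; the function names and the toy corpus are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for every token, which tokens follow it in the corpus."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never followed."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_new_tokens):
    """Greedy generation: repeatedly append the predicted next token."""
    out = [start]
    for _ in range(max_new_tokens):
        nxt = predict_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return out


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat slept".split()
    model = train_bigram(corpus)
    print(generate(model, "on", 2))
```

A large language model replaces the count table with a neural network conditioned on the whole preceding context, but the generation loop has the same shape: predict a token, append it, repeat.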


Codeforces Round #770 (Div. 2) - Codeforces

#artificialintelligence

This round will be rated for all participants with a rating of strictly less than 2100. You will have 2 hours and 30 minutes to solve 6 problems. There will be an interactive problem in the round, so we recommend that all new participants read the Interactive Problems Guide. We are happy to be the authors of the first round that AlphaCode will participate in. We will watch SelectorUnlimited, WaggleCollide and AngularNumeric in the standings.