AITopics

2209.11133

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial IntelligenceSep-23-2022

In-context Learning and Induction Heads

Olsson, Catherine, Elhage, Nelson, Nanda, Neel, Joseph, Nicholas, DasSarma, Nova, Henighan, Tom, Mann, Ben, Askell, Amanda, Bai, Yuntao, Chen, Anna, Conerly, Tom, Drain, Dawn, Ganguli, Deep, Hatfield-Dodds, Zac, Hernandez, Danny, Johnston, Scott, Jones, Andy, Kernion, Jackson, Lovitt, Liane, Ndousse, Kamal, Amodei, Dario, Brown, Tom, Clark, Jack, Kaplan, Jared, McCandlish, Sam, Olah, Chris

"Induction heads" are attention heads that implement a simple algorithm to complete token sequences like [A][B] ... [A] -> [B]. In this work, we present preliminary and indirect evidence for a hypothesis that induction heads might constitute the mechanism for the majority of all "in-context learning" in large transformer models (i.e. decreasing loss at increasing token indices). We find that induction heads develop at precisely the same point as a sudden sharp increase in in-context learning ability, visible as a bump in the training loss. We present six complementary lines of evidence, arguing that induction heads may be the mechanistic source of general in-context learning in transformer models of any size. For small attention-only models, we present strong, causal evidence; for larger models with MLPs, we present correlational evidence.

large language model, machine learning, natural language, (20 more...)

2209.11895

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Raina, Vatsal, Gales, Mark

Multiple-Choice Question Generation: Towards an Automated Assessment Framework

arXiv.org Artificial IntelligenceSep-23-2022

Automated question generation is an important approach to enable personalisation of English comprehension assessment. Recently, transformer-based pretrained language models have demonstrated the ability to produce appropriate questions from a context paragraph. Typically, these systems are evaluated against a reference set of manually generated questions using n-gram based metrics, or manual qualitative assessment. Here, we focus on a fully automated multiple-choice question generation (MCQG) system where both the question and possible answers must be generated from the context paragraph. Applying n-gram based approaches is challenging for this form of system as the reference set is unlikely to capture the full range of possible questions and answer options. Conversely manual assessment scales poorly and is expensive for MCQG system development. In this work, we propose a set of performance criteria that assess different aspects of the generated multiple-choice questions of interest. These qualities include: grammatical correctness, answerability, diversity and complexity. Initial systems for each of these metrics are described, and individually evaluated on standard multiple-choice reading comprehension corpora.

large language model, machine learning, question answering, (20 more...)

2209.1183

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(7 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report (0.82)

Industry:

Education > Educational Setting (0.46)
Education > Assessment & Standards > Student Performance (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

MIT Technology ReviewSep-22-2022, 14:00:00 GMT

DeepMind's new chatbot uses Google searches plus humans to give better answers

The difference between this approach and its predecessors is that DeepMind hopes to use "dialogue in the long term for safety," says Geoffrey Irving, a safety researcher at DeepMind. "That means we don't expect that the problems that we face in these models--either misinformation or stereotypes or whatever--are obvious at first glance, and we want to talk through them in detail. And that means between machines and humans as well," he says. DeepMind's idea of using human preferences to optimize how an AI model learns is not new, says Sara Hooker, who leads Cohere for AI, a nonprofit AI research lab. "But the improvements are convincing and show clear benefits to human-guided optimization of dialogue agents in a large-language-model setting," says Hooker. Douwe Kiela, a researcher at AI startup Hugging Face, says Sparrow is "a nice next step that follows a general trend in AI, where we are more seriously trying to improve the safety aspects of large-language-model deployments."

deepmind, new chatbot use google search, use google search plus human, (3 more...)

MIT Technology Review

Industry: Information Technology > Services (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ProgPrompt: Generating Situated Robot Task Plans using Large Language Models

Singh, Ishika, Blukis, Valts, Mousavian, Arsalan, Goyal, Ankit, Xu, Danfei, Tremblay, Jonathan, Fox, Dieter, Thomason, Jesse, Garg, Animesh

Everyday household tasks require both commonsense understanding of the world and situated knowledge about the words, which then need to be mapped to actions and world current environment. To create a task plan for "Make dinner," objects available to the agent. For example, if the LLM an agent needs common sense: object affordances, such as produced "reach in and pick up the jar of pickles," that that the stove and microwave can be used for heating; logical string would have to neatly map to an executable action like sequences of actions, such as an oven must be preheated before "pick up jar." A key component missing in LLM-based task food is added; and task relevance of objects and actions, planning is state feedback from the environment. The fridge such as heating and food are actions related to "dinner" in the in the house might not contain chicken, soda, or pickles, first place. However, this reasoning is infeasible without state but a high-level instruction "Make dinner" doesn't give us feedback. The agent needs to know what food is available in that world state information. Our work introduces situatedawareness the current environment, such as whether the freezer contains in LLM-based robot task planning.

large language model, microwave, natural language, (19 more...)

2209.11302

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.94)

A Case Report On The "A.I. Locked-In Problem": social concerns with modern NLP

Walter, Yoshija

Modern NLP models are becoming better conversational agents than their predecessors. Recurrent Neural Networks (RNNs) and especially Long-Short Term Memory (LSTM) features allow the agent to better store and use information about semantic content, a trend that has become even more pronounced with the Transformer Models. Large Language Models (LLMs) such as GPT-3 by OpenAI have become known to be able to construct and follow a narrative, which enables the system to adopt personas on the go, adapt them and play along in conversational stories. However, practical experimentation with GPT-3 shows that there is a recurring problem with these modern NLP systems, namely that they can "get stuck" in the narrative so that further conversations, prompt executions or commands become futile. This is here referred to as the "Locked-In Problem" and is exemplified with an experimental case report, followed by practical and social concerns that are accompanied with this problem.

large language model, machine learning, natural language, (21 more...)

2209.12687

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
North America > Canada (0.04)
(6 more...)

Genre: Personal > Interview (0.93)

Industry:

Media (1.00)
Leisure & Entertainment (0.94)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.52)

Gatto, Joseph, Basak, Madhusudan, Preum, Sarah M.

Scope of Pre-trained Language Models for Detecting Conflicting Health Information

An increasing number of people now rely on online platforms to meet their health information needs. Thus identifying inconsistent or conflicting textual health information has become a safety-critical task. Health advice data poses a unique challenge where information that is accurate in the context of one diagnosis can be conflicting in the context of another. For example, people suffering from diabetes and hypertension often receive conflicting health advice on diet. This motivates the need for technologies which can provide contextualized, user-specific health advice. A crucial step towards contextualized advice is the ability to compare health advice statements and detect if and how they are conflicting. This is the task of health conflict detection (HCD). Given two pieces of health advice, the goal of HCD is to detect and categorize the type of conflict. It is a challenging task, as (i) automatically identifying and categorizing conflicts requires a deeper understanding of the semantics of the text, and (ii) the amount of available data is quite limited. In this study, we are the first to explore HCD in the context of pre-trained language models. We find that DeBERTa-v3 performs best with a mean F1 score of 0.68 across all experiments. We additionally investigate the challenges posed by different conflict types and how synthetic data improves a model's understanding of conflict-specific semantics. Finally, we highlight the difficulty in collecting real health conflicts and propose a human-in-the-loop synthetic data augmentation approach to expand existing HCD datasets. Our HCD training dataset is over 2x bigger than the existing HCD dataset and is made publicly available on Github.

large language model, machine learning, natural language, (18 more...)

2209.11102

Country:

North America > United States (0.04)
North America > Dominican Republic (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Wambsganss, Thiemo, Swamy, Vinitra, Rietsche, Roman, Käser, Tanja

Bias at a Second Glance: A Deep Dive into Bias for German Educational Peer-Review Data Modeling

Natural Language Processing (NLP) has become increasingly utilized to provide adaptivity in educational applications. However, recent research has highlighted a variety of biases in pre-trained language models. While existing studies investigate bias in different domains, they are limited in addressing fine-grained analysis on educational and multilingual corpora. In this work, we analyze bias across text and through multiple architectures on a corpus of 9,165 German peer-reviews collected from university students over five years. Notably, our corpus includes labels such as helpfulness, quality, and critical aspect ratings from the peer-review recipient as well as demographic attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1) our collected corpus in connection with the clustered labels, (2) the most common pre-trained German language models (T5, BERT, and GPT-2) and GloVe embeddings, and (3) the language models after fine-tuning on our collected data-set. In contrast to our initial expectations, we found that our collected corpus does not reveal many biases in the co-occurrence analysis or in the GloVe embeddings. However, the pre-trained German language models find substantial conceptual, racial, and gender bias and have significant changes in bias across conceptual and racial axes during fine-tuning on the peer-review data. With our research, we aim to contribute to the fourth UN sustainability goal (quality education) with a novel dataset, an understanding of biases in natural language education data, and the potential harms of not counteracting biases in language models for educational tasks.

large language model, machine learning, natural language, (20 more...)

2209.10335

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(10 more...)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.48)

Industry:

Education > Educational Setting (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation

Yuan, Xingdi, Wang, Tong, Wang, Yen-Hsiang, Fine, Emery, Abdelghani, Rania, Lucas, Pauline, Sauzéon, Hélène, Oudeyer, Pierre-Yves

Large Language Models (LLMs) have in recent years demonstrated impressive prowess in natural language generation. A common practice to improve generation diversity is to sample multiple outputs from the model. However, there lacks a simple and robust way of selecting the best output from these stochastic samples. As a case study framed in the context of question generation, we propose two prompt-based approaches to selecting high-quality questions from a set of LLM-generated candidates. Our method works under the constraints of 1) a black-box (non-modifiable) question generation model and 2) lack of access to human-annotated references -- both of which are realistic limitations for real-world deployment of LLMs. With automatic as well as human evaluations, we empirically demonstrate that our approach can effectively select questions of higher qualities than greedy generation.

computational linguistic, large language model, machine learning, (18 more...)

2209.11

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation

Hong, Seongmin, Moon, Seungjae, Kim, Junsoo, Lee, Sungjae, Kim, Minsub, Lee, Dongsoo, Kim, Joo-Young

Transformer is a deep learning language model widely used for natural language processing (NLP) services in datacenters. Among transformer models, Generative Pre-trained Transformer (GPT) has achieved remarkable performance in text generation, or natural language generation (NLG), which needs the processing of a large input context in the summarization stage, followed by the generation stage that produces a single word at a time. The conventional platforms such as GPU are specialized for the parallel processing of large inputs in the summarization stage, but their performance significantly degrades in the generation stage due to its sequential characteristic. Therefore, an efficient hardware platform is required to address the high latency caused by the sequential characteristic of text generation. In this paper, we present DFX, a multi-FPGA acceleration appliance that executes GPT-2 model inference end-to-end with low latency and high throughput in both summarization and generation stages. DFX uses model parallelism and optimized dataflow that is model-and-hardware-aware for fast simultaneous workload execution among devices. Its compute cores operate on custom instructions and provide GPT-2 operations end-to-end. We implement the proposed hardware architecture on four Xilinx Alveo U280 FPGAs and utilize all of the channels of the high bandwidth memory (HBM) and the maximum number of compute resources for high hardware efficiency. DFX achieves 5.58x speedup and 3.99x energy efficiency over four NVIDIA V100 GPUs on the modern GPT-2 model. DFX is also 8.21x more cost-effective than the GPU appliance, suggesting that it is a promising solution for text generation workloads in cloud datacenters.

large language model, machine learning, natural language, (21 more...)

2209.10797

Country:

Asia > South Korea > Daejeon > Daejeon (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.84)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)