AITopics | Stoehr, Niklas

Collaborating Authors

Stoehr, Niklas

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Controllable Context Sensitivity and the Knob Behind It

Minder, Julian, Du, Kevin, Stoehr, Niklas, Monea, Giovanni, Wendler, Chris, West, Robert, Cotterell, Ryan

arXiv.org Artificial IntelligenceNov-11-2024

When making predictions, a language model must trade off how much it relies on its context vs. its prior knowledge. Choosing how sensitive the model is to its context is a fundamental functionality, as it enables the model to excel at tasks like retrieval-augmented generation and question-answering. In this paper, we search for a knob which controls this sensitivity, determining whether language models answer from the context or their prior knowledge. To guide this search, we design a task for controllable context sensitivity. In this task, we first feed the model a context (Paris is in England) and a question (Where is Paris?); we then instruct the model to either use its prior or contextual knowledge and evaluate whether it generates the correct answer for both intents (either France or England). When fine-tuned on this task, instruction-tuned versions of Llama-3.1, Mistral-v0.3, and Gemma-2 can solve it with high accuracy (85-95%). Analyzing these high-performing models, we narrow down which layers may be important to context sensitivity using a novel linear time algorithm. Then, in each model, we identify a 1-D subspace in a single layer that encodes whether the model follows context or prior knowledge. Interestingly, while we identify this subspace in a fine-tuned model, we find that the exact same subspace serves as an effective knob in not only that model but also non-fine-tuned instruct and base models of that model family. Finally, we show a strong correlation between a model's performance and how distinctly it separates context-agreeing from context-ignoring answers in this subspace. These results suggest a single subspace facilitates how the model chooses between context and prior knowledge, hinting at a simple fundamental mechanism that controls this behavior.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2411.07404

Country:

Asia > Middle East (0.93)
Europe > United Kingdom > England (0.44)
Africa > Middle East > Tunisia (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Consumer Products & Services > Travel (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Activation Scaling for Steering and Interpreting Language Models

Stoehr, Niklas, Du, Kevin, Snæbjarnarson, Vésteinn, West, Robert, Cotterell, Ryan, Schein, Aaron

arXiv.org Artificial IntelligenceOct-7-2024

Given the prompt "Rome is in", can we steer a language model to flip its prediction of an incorrect token "France" to a correct token "Italy" by only multiplying a few relevant activation vectors with scalars? We argue that successfully intervening on a model is a prerequisite for interpreting its internal workings. Concretely, we establish a three-term objective: a successful intervention should flip the correct with the wrong token and vice versa (effectiveness), and leave other tokens unaffected (faithfulness), all while being sparse (minimality). Using gradient-based optimization, this objective lets us learn (and later evaluate) a specific kind of efficient and interpretable intervention: activation scaling only modifies the signed magnitude of activation vectors to strengthen, weaken, or reverse the steering directions already encoded in the model. On synthetic tasks, this intervention performs comparably with steering vectors in terms of effectiveness and faithfulness, but is much more minimal allowing us to pinpoint interpretable model components. We evaluate activation scaling from different angles, compare performance on different datasets, and make activation scalars a learnable function of the activation vectors themselves to generalize to varying-length prompts.

intervention, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.04962

Country:

Europe > Italy (0.25)
Europe > France (0.25)
Europe > Poland (0.16)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Context versus Prior Knowledge in Language Models

Du, Kevin, Snæbjarnarson, Vésteinn, Stoehr, Niklas, White, Jennifer C., Schein, Aaron, Cotterell, Ryan

arXiv.org Artificial IntelligenceJun-16-2024

To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to higher exposure in the training corpus, and be more easily persuaded by some contexts than others. To formalize this problem, we propose two mutual information-based metrics to measure a model's dependency on a context and on its prior about an entity: first, the persuasion score of a given context represents how much a model depends on the context in its decision, and second, the susceptibility score of a given entity represents how much the model can be swayed away from its original answer distribution about an entity. We empirically test our metrics for their validity and reliability. Finally, we explore and find a relationship between the scores and the model's expected familiarity with an entity, and provide two use cases to illustrate their benefits.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2404.04633

Country:

North America > United States (1.00)
Asia (0.67)
Europe > United Kingdom > England (0.14)

Genre:

Research Report > New Finding (0.98)
Research Report > Experimental Study (0.68)

Industry:

Media (0.93)
Government > Regional Government > North America Government > United States Government (0.67)
Leisure & Entertainment > Sports > Soccer (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Localizing Paragraph Memorization in Language Models

Stoehr, Niklas, Gordon, Mitchell, Zhang, Chiyuan, Lewis, Owen

arXiv.org Machine LearningMar-28-2024

Can we localize the weights and mechanisms used by a language model to memorize and recite entire paragraphs of its training data? In this paper, we show that while memorization is spread across multiple layers and model components, gradients of memorized paragraphs have a distinguishable spatial pattern, being larger in lower model layers than gradients of non-memorized examples. Moreover, the memorized examples can be unlearned by fine-tuning only the high-gradient weights. We localize a low-layer attention head that appears to be especially involved in paragraph memorization. This head is predominantly focusing its attention on distinctive, rare tokens that are least frequent in a corpus-level unigram distribution. Next, we study how localized memorization is across the tokens in the prefix by perturbing tokens and measuring the caused change in the decoding. A few distinctive tokens early in a prefix can often corrupt the entire continuation. Overall, memorized continuations are not only harder to unlearn, but also to corrupt than non-memorized ones.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2403.19851

Country:

Oceania > Australia (0.14)
North America > Canada (0.14)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports > Baseball (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Unsupervised Contrast-Consistent Ranking with Language Models

Stoehr, Niklas, Cheng, Pengxiang, Wang, Jing, Preotiuc-Pietro, Daniel, Bhowmik, Rajarshi

arXiv.org Machine LearningSep-13-2023

Language models contain ranking-based knowledge and are powerful solvers of in-context ranking tasks. For instance, they may have parametric knowledge about the ordering of countries by size or may be able to rank reviews by sentiment. Recent work focuses on pairwise, pointwise, and listwise prompting techniques to elicit a language model's ranking knowledge. However, we find that even with careful calibration and constrained decoding, prompting-based techniques may not always be self-consistent in the rankings they produce. This motivates us to explore an alternative approach that is inspired by an unsupervised probing method called Contrast-Consistent Search (CCS). The idea is to train a probing model guided by a logical constraint: a model's representation of a statement and its negation must be mapped to contrastive true-false poles consistently across multiple statements. We hypothesize that similar constraints apply to ranking tasks where all items are related via consistent pairwise or listwise comparisons. To this end, we extend the binary CCS method to Contrast-Consistent Ranking (CCR) by adapting existing ranking methods such as the Max-Margin Loss, Triplet Loss, and Ordinal Regression objective. Our results confirm that, for the same language model, CCR probing outperforms prompting and even performs on a par with prompting much larger language models.

artificial intelligence, ccr, natural language, (17 more...)

arXiv.org Machine Learning

2309.06991

Country:

Europe (1.00)
North America > United States (0.46)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

ACTI at EVALITA 2023: Overview of the Conspiracy Theory Identification Task

Russo, Giuseppe, Stoehr, Niklas, Ribeiro, Manoel Horta

arXiv.org Artificial IntelligenceSep-2-2023

Automatic Conspiracy Theory Identification (ACTI) is a new shared task proposed for the first time at the EVALITA 2023 evaluation campaign. ACTI is based on a new, manually labeled dataset of comments scraped from conspiratorial Telegram channels and consists of two subtasks: (1) identifying conspiratorial content (conspiratorial content classification); and (2) classifying content into specific conspiracy theories (conspiratorial category classification). A total of 15 teams participated in the task with 81 submissions. In this task summary, we discuss the data and task, and outline the bestperforming approaches that are largely based on large language models. We conclude with a brief discussion of the application of large language models to counter the spread of misinformation on online platforms.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2307.06954

Country:

North America > United States (1.00)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.72)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Generalizing Backpropagation for Gradient-Based Interpretability

Du, Kevin, Hennigen, Lucas Torroba, Stoehr, Niklas, Warstadt, Alexander, Cotterell, Ryan

arXiv.org Artificial IntelligenceJul-6-2023

Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs. While these methods can indicate which input features may be important for the model's prediction, they reveal little about the inner workings of the model itself. In this paper, we observe that the gradient computation of a model is a special case of a more general formulation using semirings. This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics about the gradient graph of a neural network, such as the highest-weighted path and entropy. We implement this generalized algorithm, evaluate it on synthetic datasets to better understand the statistics it computes, and apply it to study BERT's behavior on the subject-verb number agreement task (SVA). With this method, we (a) validate that the amount of gradient flow through a component of a model reflects its importance to a prediction and (b) for SVA, identify which pathways of the self-attention mechanism are most important.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2307.03056

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

World Models for Math Story Problems

Opedal, Andreas, Stoehr, Niklas, Saparov, Abulhair, Sachan, Mrinmaya

arXiv.org Artificial IntelligenceJun-7-2023

Solving math story problems is a complex task for students and NLP models alike, requiring them to understand the world as described in the story and reason over it to compute an answer. Recent years have seen impressive performance on automatically solving these problems with large pre-trained language models and innovative techniques to prompt them. However, it remains unclear if these models possess accurate representations of mathematical concepts. This leads to lack of interpretability and trustworthiness which impedes their usefulness in various applications. In this paper, we consolidate previous work on categorizing and representing math story problems and develop MathWorld, which is a graph-based semantic formalism specific for the domain of math story problems. With MathWorld, we can assign world models to math story problems which represent the situations and actions introduced in the text and their mathematical relationships. We combine math story problems from several existing datasets and annotate a corpus of 1,019 problems and 3,204 logical forms with MathWorld. Using this data, we demonstrate the following use cases of MathWorld: (1) prompting language models with synthetically generated question-answer pairs to probe their reasoning and world modeling abilities, and (2) generating new problems by using the world models as a design space.

artificial intelligence, container, natural language, (15 more...)

arXiv.org Artificial Intelligence

2306.04347

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.83)

Industry:

Education (1.00)
Leisure & Entertainment > Games (0.46)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

An Ordinal Latent Variable Model of Conflict Intensity

Stoehr, Niklas, Hennigen, Lucas Torroba, Valvoda, Josef, West, Robert, Cotterell, Ryan, Schein, Aaron

arXiv.org Artificial IntelligenceJun-4-2023

Measuring the intensity of events is crucial for monitoring and tracking armed conflict. Advances in automated event extraction have yielded massive data sets of "who did what to whom" micro-records that enable data-driven approaches to monitoring conflict. The Goldstein scale is a widely-used expert-based measure that scores events on a conflictual-cooperative scale. It is based only on the action category ("what") and disregards the subject ("who") and object ("to whom") of an event, as well as contextual information, like associated casualty count, that should contribute to the perception of an event's "intensity". This paper takes a latent variable-based approach to measuring conflict intensity. We introduce a probabilistic generative model that assumes each observed event is associated with a latent intensity class. A novel aspect of this model is that it imposes an ordering on the classes, such that higher-valued classes denote higher levels of intensity. The ordinal nature of the latent variable is induced from naturally ordered aspects of the data (e.g., casualty counts) where higher values naturally indicate higher intensity. We evaluate the proposed model both intrinsically and extrinsically, showing that it obtains comparatively good held-out predictive performance.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2210.03971

Country:

Asia > Middle East (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Government > Military (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
(2 more...)

Add feedback

Rethinking the Event Coding Pipeline with Prompt Entailment

Lefebvre, Clément, Stoehr, Niklas

arXiv.org Artificial IntelligenceMay-5-2023

For monitoring crises, political events are extracted from the news. The large amount of unstructured full-text event descriptions makes a case-by-case analysis unmanageable, particularly for low-resource humanitarian aid organizations. This creates a demand to classify events into event types, a task referred to as event coding. Typically, domain experts craft an event type ontology, annotators label a large dataset and technical experts develop a supervised coding system. In this work, we propose PR-ENT, a new event coding approach that is more flexible and resource-efficient, while maintaining competitive accuracy: first, we extend an event description such as "Military injured two civilians'' by a template, e.g. "People were [Z]" and prompt a pre-trained (cloze) language model to fill the slot Z. Second, we select answer candidates Z* = {"injured'', "hurt"...} by treating the event description as premise and the filled templates as hypothesis in a textual entailment task. This allows domain experts to draft the codebook directly as labeled prompts and interpretable answer candidates. This human-in-the-loop process is guided by our interactive codebook design tool. We evaluate PR-ENT in several robustness checks: perturbing the event description and prompt template, restricting the vocabulary and removing contextual information.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2210.05257

Country: Asia (0.28)

Genre: Research Report (0.50)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)

Add feedback