AITopics | Seifert, Christin

Collaborating Authors

Seifert, Christin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Guiding LLMs to Generate High-Fidelity and High-Quality Counterfactual Explanations for Text Classification

Nguyen, Van Bach, Seifert, Christin, Schlötterer, Jörg

arXiv.org Artificial IntelligenceMar-6-2025

The need for interpretability in deep learning has driven interest in counterfactual explanations, which identify minimal changes to an instance that change a model's prediction. Current counterfactual (CF) generation methods require task-specific fine-tuning and produce low-quality text. Large Language Models (LLMs), though effective for high-quality text generation, struggle with label-flipping counterfactuals (i.e., counterfactuals that change the prediction) without fine-tuning. We introduce two simple classifier-guided approaches to support counterfactual generation by LLMs, eliminating the need for fine-tuning while preserving the strengths of LLMs. Despite their simplicity, our methods outperform state-of-the-art counterfactual generation methods and are effective across different LLMs, highlighting the benefits of guiding counterfactual generation by LLMs with classifier information. We further show that data augmentation by our generated CFs can improve a classifier's robustness. Our analysis reveals a critical issue in counterfactual generation by LLMs: LLMs rely on parametric knowledge rather than faithfully following the classifier.

classifier, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2503.04463

Country:

North America > Canada (0.28)
Asia > Middle East > UAE (0.14)
North America > United States > Oregon (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Behavioral Analysis of Information Salience in Large Language Models

Trienes, Jan, Schlötterer, Jörg, Li, Junyi Jessy, Seifert, Christin

arXiv.org Artificial IntelligenceFeb-20-2025

Large Language Models (LLMs) excel at text summarization, a task that requires models to select content based on its importance. However, the exact notion of salience that LLMs have internalized remains unclear. To bridge this gap, we introduce an explainable framework to systematically derive and investigate information salience in LLMs through their summarization behavior. Using length-controlled summarization as a behavioral probe into the content selection process, and tracing the answerability of Questions Under Discussion throughout, we derive a proxy for how models prioritize information. Our experiments on 13 models across four datasets reveal that LLMs have a nuanced, hierarchical notion of salience, generally consistent across model families and sizes. While models show highly consistent behavior and hence salience patterns, this notion of salience cannot be accessed through introspection, and only weakly correlates with human perceptions of information salience.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.14613

Country:

Asia (0.46)
North America > United States > Texas (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

This looks like what? Challenges and Future Research Directions for Part-Prototype Models

Elhadri, Khawla, Michalski, Tomasz, Wróbel, Adam, Schlötterer, Jörg, Zieliński, Bartosz, Seifert, Christin

arXiv.org Artificial IntelligenceFeb-13-2025

The growing interest in eXplainable Artificial Intelligence (XAI) has prompted research into models with built-in interpretability, the most prominent of which are part-prototype models. Part-Prototype Models (PPMs) make decisions by comparing an input image to a set of learned prototypes, providing human-understandable explanations in the form of ``this looks like that''. Despite their inherent interpretability, PPMS are not yet considered a valuable alternative to post-hoc models. In this survey, we investigate the reasons for this and provide directions for future research. We analyze papers from 2019 to 2024, and derive a taxonomy of the challenges that current PPMS face. Our analysis shows that the open challenges are quite diverse. The main concern is the quality and quantity of prototypes. Other concerns are the lack of generalization to a variety of tasks and contexts, and general methodological issues, including non-standardized evaluation. We provide ideas for future research in five broad directions: improving predictive performance, developing novel architectures grounded in theory, establishing frameworks for human-AI collaboration, aligning models with humans, and establishing metrics and benchmarks for evaluation. We hope that this survey will stimulate research and promote intrinsically interpretable models for application domains. Our list of surveyed papers is available at https://github.com/aix-group/ppm-survey.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.0934

Country: Europe (0.14)

Genre: Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
(2 more...)

Add feedback

Position: Editing Large Language Models Poses Serious Safety Risks

Youssef, Paul, Zhao, Zhixue, Braun, Daniel, Schlötterer, Jörg, Seifert, Christin

arXiv.org Artificial IntelligenceFeb-5-2025

Large Language Models (LLMs) contain large amounts of facts about the world. These facts can become outdated over time, which has led to the development of knowledge editing methods (KEs) that can change specific facts in LLMs with limited side effects. This position paper argues that editing LLMs poses serious safety risks that have been largely overlooked. First, we note the fact that KEs are widely available, computationally inexpensive, highly performant, and stealthy makes them an attractive tool for malicious actors. Second, we discuss malicious use cases of KEs, showing how KEs can be easily adapted for a variety of malicious purposes. Third, we highlight vulnerabilities in the AI ecosystem that allow unrestricted uploading and downloading of updated models without verification. Fourth, we argue that a lack of social and institutional awareness exacerbates this risk, and discuss the implications for different stakeholders. We call on the community to (i) research tamper-resistant models and countermeasures against malicious model editing, and (ii) actively engage in securing the AI ecosystem.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.02958

Country:

Asia (1.00)
Europe (0.68)
North America > United States > California (0.14)
North America > Mexico > Mexico City (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government (0.93)
Media > News (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers

Kuhn, Lukas, Sadiya, Sari, Schlotterer, Jorg, Seifert, Christin, Roig, Gemma

arXiv.org Artificial IntelligenceJan-1-2025

Shortcut learning, i.e., a model's reliance on undesired features not directly relevant to the task, is a major challenge that severely limits the applications of machine learning algorithms, particularly when deploying them to assist in making sensitive decisions, such as in medical diagnostics. In this work, we leverage recent advancements in machine learning to create an unsupervised framework that is capable of both detecting and mitigating shortcut learning in transformers. We validate our method on multiple datasets. Results demonstrate that our framework significantly improves both worst-group accuracy (samples misclassified due to shortcuts) and average accuracy, while minimizing human annotation effort. Moreover, we demonstrate that the detected shortcuts are meaningful and informative to human experts, and that our framework is computationally efficient, allowing it to be run on consumer hardware.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.00942

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Add feedback

Enhancing Fact Retrieval in PLMs through Truthfulness

Youssef, Paul, Schlötterer, Jörg, Seifert, Christin

arXiv.org Artificial IntelligenceOct-17-2024

Pre-trained Language Models (PLMs) encode various facts about the world at their pre-training phase as they are trained to predict the next or missing word in a sentence. There has a been an interest in quantifying and improving the amount of facts that can be extracted from PLMs, as they have been envisioned to act as soft knowledge bases, which can be queried in natural language. Different approaches exist to enhance fact retrieval from PLM. Recent work shows that the hidden states of PLMs can be leveraged to determine the truthfulness of the PLMs' inputs. Leveraging this finding to improve factual knowledge retrieval remains unexplored. In this work, we investigate the use of a helper model to improve fact retrieval. The helper model assesses the truthfulness of an input based on the corresponding hidden states representations from the PLMs. We evaluate this approach on several masked PLMs and show that it enhances fact retrieval by up to 33\%. Our findings highlight the potential of hidden states representations from PLMs in improving their factual knowledge retrieval.

computational linguistic, natural language, question answering, (18 more...)

arXiv.org Artificial Intelligence

2410.13562

Country:

Europe (0.94)
Asia > Thailand (0.47)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Add feedback

Can We Reverse In-Context Knowledge Edits?

Youssef, Paul, Zhao, Zhixue, Schlötterer, Jörg, Seifert, Christin

arXiv.org Artificial IntelligenceOct-16-2024

In-context knowledge editing (IKE) enables efficient modification of large language model (LLM) outputs without parameter changes and at zero-cost. However, it can be misused to manipulate responses opaquely, e.g., insert misinformation or offensive content. Such malicious interventions could be incorporated into high-level wrapped APIs where the final input prompt is not shown to end-users. To address this issue, we investigate the detection and reversal of IKE-edits. First, we demonstrate that IKE-edits can be detected with high accuracy (F1 > 80\%) using only the top-10 output probabilities of the next token, even in a black-box setting, e.g. proprietary LLMs with limited output information. Further, we introduce the novel task of reversing IKE-edits using specially tuned reversal tokens. We explore using both continuous and discrete reversal tokens, achieving over 80\% accuracy in recovering original, unedited outputs across multiple LLMs. Our continuous reversal tokens prove particularly effective, with minimal impact on unedited prompts. Through analysis of output distributions, attention patterns, and token rankings, we provide insights into IKE's effects on LLMs and how reversal tokens mitigate them. This work represents a significant step towards enhancing LLM resilience against potential misuse of in-context editing, improving their transparency and trustworthiness.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.12586

Country: Asia > Middle East (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Investigating the Impact of Randomness on Reproducibility in Computer Vision: A Study on Applications in Civil Engineering and Medicine

Eryılmaz, Bahadır, Koraş, Osman Alperen, Schlötterer, Jörg, Seifert, Christin

arXiv.org Artificial IntelligenceSep-19-2024

Abstract--Reproducibility is essential for scientific research. CUDA's advantages for accelerating algorithm execution on GPUs, if not controlled, its behavior across multiple executions This reality emphasizes the importance of The reproducibility crisis in machine learning is a growing understanding CUDA-induced randomness and its impact on concern that questions the reliability and validity of reported model performance in real-world applications. One survey shows that not all researchers lack of such insight could lead researchers to overestimate or are aware of this problem [2]. This issue stems from the difficulty underestimate the capabilities of machine learning algorithms, in replicating results due to various unknown and poorly which in turn could misdirect the efforts of the research understood factors, including but not limited to differences community. Deep learning architectures, as they are widely implications on reproducibility, and its broader implications on used in computer vision, with their complex, multi-layered real-world computer vision applications.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.02806

Country:

Asia (0.46)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Energy (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.47)
Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Detecting Edited Knowledge in Language Models

Youssef, Paul, Zhao, Zhixue, Schlötterer, Jörg, Seifert, Christin

arXiv.org Artificial IntelligenceJul-1-2024

Knowledge editing methods (KEs) can update language models' obsolete or inaccurate knowledge learned from pre-training. However, KEs can be used for malicious applications, e.g., inserting misinformation and toxic content. Knowing whether a generated output is based on edited knowledge or first-hand knowledge from pre-training can increase users' trust in generative models and provide more transparency. Driven by this, we propose a novel task: detecting edited knowledge in language models. Given an edited model and a fact retrieved by a prompt from an edited model, the objective is to classify the knowledge as either unedited (based on the pre-training), or edited (based on subsequent editing). We instantiate the task with four KEs, two LLMs, and two datasets. Additionally, we propose using the hidden state representations and the probability distributions as features for the detection. Our results reveal that, using these features as inputs to a simple AdaBoost classifiers establishes a strong baseline. This classifier requires only a limited amount of data and maintains its performance even in cross-domain settings. Last, we find it more challenging to distinguish edited knowledge from unedited but related knowledge, highlighting the need for further research. Our work lays the groundwork for addressing malicious model editing, which is a critical challenge associated with the strong generative capabilities of LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.02765

Country:

North America > Canada (0.28)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.77)

Add feedback

Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding

Idrissi-Yaghir, Ahmad, Dada, Amin, Schäfer, Henning, Arzideh, Kamyar, Baldini, Giulia, Trienes, Jan, Hasin, Max, Bewersdorff, Jeanette, Schmidt, Cynthia S., Bauer, Marie, Smith, Kaleb E., Bian, Jiang, Wu, Yonghui, Schlötterer, Jörg, Zesch, Torsten, Horn, Peter A., Seifert, Christin, Nensa, Felix, Kleesiek, Jens, Friedrich, Christoph M.

arXiv.org Artificial IntelligenceMay-8-2024

Recent advances in natural language processing (NLP) can be largely attributed to the advent of pre-trained language models such as BERT and RoBERTa. While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. We pre-trained several German medical language models on 2.4B tokens derived from translated public English medical data and 3B tokens of German clinical data. The resulting models were evaluated on various German downstream tasks, including named entity recognition (NER), multi-label classification, and extractive question answering. Our results suggest that models augmented by clinical and translation-based pre-training typically outperform general domain models in medical contexts. We conclude that continuous pre-training has demonstrated the ability to match or even exceed the performance of clinical models trained from scratch. Furthermore, pre-training on clinical data or leveraging translated texts have proven to be reliable methods for domain adaptation in medical NLP tasks.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2404.05694

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Information Technology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)

Add feedback