AITopics | Hennig, Leonhard

Hennig, Leonhard

Entity Linking using LLMs for Automated Product Carbon Footprint Estimation

Castle, Steffen, Schneider, Julian Moreno, Hennig, Leonhard, Rehm, Georg

arXiv.org Artificial Intelligence, Feb-11-2025

Growing concerns about climate change and sustainability are driving manufacturers to take significant steps toward reducing their carbon footprints. For these manufacturers, a first step towards this goal is to identify the environmental impact of the individual components of their products. We propose a system leveraging large language models (LLMs) to automatically map components from manufacturer Bills of Materials (BOMs) to Life Cycle Assessment (LCA) database entries by using LLMs to expand on available component information. Our approach reduces the need for manual data processing, paving the way for more accessible sustainability practices.

artificial intelligence, large language model, natural language, (18 more...)

2502.07418

Country: Europe > Germany (0.14)

Genre:

Research Report (0.64)
Workflow (0.47)
Overview > Growing Problem (0.35)

Industry:

Materials > Metals & Mining (0.71)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > Polymers & Plastics (0.49)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Reverse Probing: Evaluating Knowledge Transfer via Finetuned Task Embeddings for Coreference Resolution

Anikina, Tatiana, Binder, Arne, Harbecke, David, Varanasi, Stalin, Hennig, Leonhard, Ostermann, Simon, Möller, Sebastian, van Genabith, Josef

arXiv.org Artificial Intelligence, Jan-31-2025

In this work, we reimagine classical probing to evaluate knowledge transfer from simple source to more complex target tasks. Instead of probing frozen representations from a complex source task on diverse simple target probing tasks (as usually done in probing), we explore the effectiveness of embeddings from multiple simple source tasks on a single target task. We select coreference resolution, a linguistically complex problem requiring contextual understanding, as the focal target task, and test the usefulness of embeddings from comparatively simpler tasks such as paraphrase detection, named entity recognition, and relation extraction. Through systematic experiments, we evaluate the impact of individual and combined task embeddings. Our findings reveal that task embeddings vary significantly in utility for coreference resolution, with semantic similarity tasks (e.g., paraphrase detection) proving most beneficial. Additionally, representations from intermediate layers of fine-tuned models often outperform those from final layers. Combining embeddings from multiple tasks consistently improves performance, with attention-based aggregation yielding substantial gains. These insights shed light on relationships between task-specific representations and their adaptability to complex downstream tasks, encouraging further exploration of embedding-level task transfer.
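The attention-based aggregation mentioned in the abstract above can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names `softmax` and `aggregate`, the dot-product scoring, the fixed `query` vector, and all numeric values are hypothetical. In a trained model the query would be learned, and the embeddings would come from fine-tuned source-task models.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(task_embeddings, query):
    """Attention-weighted sum of per-task embeddings for one token."""
    names = list(task_embeddings)
    # Dot-product score between the query and each task's embedding.
    scores = [sum(q * e for q, e in zip(query, task_embeddings[n])) for n in names]
    weights = softmax(scores)
    # Weighted sum over tasks, computed dimension by dimension.
    combined = [
        sum(w * task_embeddings[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), combined


# Toy 3-dimensional embeddings (made-up values, not results from the paper).
task_embeddings = {
    "paraphrase_detection": [1.0, 0.0, 0.0],
    "named_entity_recognition": [0.0, 1.0, 0.0],
    "relation_extraction": [0.0, 0.0, 1.0],
}
query = [2.0, 0.0, 0.0]  # hypothetical; would be learned in practice

weights, combined = aggregate(task_embeddings, query)
```

The weights sum to one, so the combined vector stays on the same scale as the individual task embeddings while letting the model favor whichever source task is most useful for a given token.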

aggregation, artificial intelligence, natural language, (18 more...)

2501.19316

Country:

Europe (0.94)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Symmetric Dot-Product Attention for Efficient Training of BERT Language Models

Courtois, Martin, Ostendorff, Malte, Hennig, Leonhard, Rehm, Georg

arXiv.org Artificial Intelligence, Jun-19-2024

Initially introduced as a machine translation model, the Transformer architecture has now become the foundation for modern deep learning architectures, with applications in a wide range of fields, from computer vision to natural language processing. Nowadays, to tackle increasingly complex tasks, Transformer-based models are stretched to enormous sizes, requiring increasingly larger training datasets and an unsustainable amount of compute resources. The ubiquitous nature of the Transformer and its core component, the attention mechanism, are thus prime targets for efficiency research. In this work, we propose an alternative compatibility function for the self-attention mechanism introduced by the Transformer architecture. This compatibility function exploits an overlap in the learned representation of the traditional scaled dot-product attention, leading to a symmetric dot-product attention with pairwise coefficients. When applied to the pre-training of BERT-like models, this new symmetric attention mechanism reaches a score of 79.36 on the GLUE benchmark against 78.74 for the traditional implementation, leads to a reduction of 6% in the number of trainable parameters, and reduces the number of training steps required before convergence by half.
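One way to picture a symmetric compatibility function is to let queries and keys share a single projection, so that the score matrix equals its own transpose and one projection matrix is saved. The sketch below makes that assumption explicitly; it is not confirmed by the abstract, it omits the pairwise-coefficient variant, and the names `matmul`, `softmax`, `symmetric_attention`, `w_shared`, and `w_value`, along with all numbers, are hypothetical.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def symmetric_attention(x, w_shared, w_value):
    """Single-head self-attention with one projection shared by queries and keys."""
    qk = matmul(x, w_shared)  # assumption: queries and keys are the same matrix
    v = matmul(x, w_value)
    d = len(qk[0])
    qk_t = [list(col) for col in zip(*qk)]
    # Scaled dot-product scores; symmetric because both operands are qk.
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(qk, qk_t)]
    weights = [softmax(row) for row in scores]
    return scores, matmul(weights, v)


# Three tokens with 3-dimensional inputs (made-up values).
x = [[1.0, 0.0, 2.0], [0.5, 1.0, -1.0], [0.0, 2.0, 1.0]]
w_shared = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]
w_value = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

scores, output = symmetric_attention(x, w_shared, w_value)
```

Note that only the raw scores are symmetric; after the row-wise softmax the attention weights generally are not.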

artificial intelligence, machine learning, natural language, (19 more...)

2406.06366

Country:

Europe (0.68)
North America > United States > Texas (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PyTorch-IE: Fast and Reproducible Prototyping for Information Extraction

Binder, Arne, Hennig, Leonhard, Alt, Christoph

arXiv.org Artificial Intelligence, May-16-2024

The objective of Information Extraction (IE) is to derive structured representations from unstructured or semi-structured documents. However, developing IE models is complex because several subtasks must be integrated. Additionally, representing data across varied tasks and transforming datasets into task-specific model inputs present further challenges. To streamline this undertaking for researchers, we introduce PyTorch-IE, a deep-learning-based framework uniquely designed to enable swift, reproducible, and reusable implementations of IE models. PyTorch-IE offers a flexible data model capable of creating complex data structures by integrating interdependent layers of annotations derived from various data types, like plain text or semi-structured text, and even images. We propose task modules to decouple the concerns of data representation and model-specific representations, thereby fostering greater flexibility and reusability of code. PyTorch-IE also extends support for widely used libraries such as PyTorch-Lightning for training, HuggingFace datasets for dataset reading, and Hydra for experiment configuration. Supplementary libraries and GitHub templates for the easy setup of new projects are also provided. By ensuring functionality and versatility, PyTorch-IE provides vital support to the research community engaged in Information Extraction.

computational linguistic, machine learning, natural language, (15 more...)

2406.00007

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Overview (0.47)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools

Wang, Qianli, Anikina, Tatiana, Feldhus, Nils, van Genabith, Josef, Hennig, Leonhard, Möller, Sebastian

arXiv.org Artificial Intelligence, Jan-23-2024

Interpretability tools that offer explanations in the form of a dialogue have demonstrated their efficacy in enhancing users' understanding, as one-off explanations may occasionally fall short in providing sufficient information to the user. Current solutions for dialogue-based explanations, however, require many dependencies and are not easily transferable to tasks they were not designed for. With LLMCheckup, we present an easily accessible tool that allows users to chat with any state-of-the-art large language model (LLM) about its behavior. We enable LLMs to generate all explanations by themselves and take care of intent recognition without fine-tuning, by connecting them with a broad spectrum of Explainable AI (XAI) tools, e.g. feature attributions, embedding-based similarity, and prompting strategies for counterfactual and rationale generation. LLM (self-)explanations are presented as an interactive dialogue that supports follow-up questions and generates suggestions. LLMCheckup provides tutorials for operations available in the system, catering to individuals with varying levels of expertise in XAI, and supports multiple input modalities. We introduce a new parsing strategy called multi-prompt parsing, which substantially enhances the parsing accuracy of LLMs. Finally, we showcase the tasks of fact checking and commonsense question answering.

computational linguistic, large language model, machine learning, (17 more...)

2401.12576

Country:

Asia (0.68)
North America > United States > New York (0.14)
North America > United States > Louisiana (0.14)
Europe > Germany > Saarland (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Factuality Detection using Machine Translation -- a Use Case for German Clinical Text

Sumait, Mohammed Bin, Gabryszak, Aleksandra, Hennig, Leonhard, Roller, Roland

arXiv.org Artificial Intelligence, Aug-17-2023

Factuality can play an important role when automatically processing clinical text, as it makes a difference if particular symptoms are explicitly not present, possibly present, not mentioned, or affirmed. In most cases, a sufficient number of examples is necessary to handle such phenomena in a supervised machine learning setting. However, as clinical text might contain sensitive information, data cannot be easily shared. In the context of factuality detection, this work presents a simple solution using machine translation to translate English data to German to train a transformer-based factuality detection model.

artificial intelligence, machine learning, natural language, (15 more...)

2308.08827

Country:

Asia (0.68)
Europe (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods

Feldhus, Nils, Hennig, Leonhard, Nasert, Maximilian Dustin, Ebert, Christopher, Schwarzenberg, Robert, Möller, Sebastian

arXiv.org Artificial Intelligence, Jun-7-2023

Saliency maps can explain a neural model's predictions by identifying important input features. They are difficult to interpret for laypeople, especially for instances with many features. In order to make them more accessible, we formalize the underexplored task of translating saliency maps into natural language and compare methods that address two key challenges of this approach -- what and how to verbalize. In both automatic and human evaluation setups, using token-level attributions from text classification tasks, we compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations (heatmap visualizations and extractive rationales), measuring simulatability, faithfulness, helpfulness and ease of understanding. Instructing GPT-3.5 to generate saliency map verbalizations yields plausible explanations which include associations, abstractive summarization and commonsense reasoning, achieving by far the highest human ratings, but they are not faithfully capturing numeric information and are inconsistent in their interpretation of the task. In comparison, our search-based, model-free verbalization approach efficiently completes templated verbalizations, is faithful by design, but falls short in helpfulness and simulatability. Our results suggest that saliency map verbalization makes feature attribution explanations more comprehensible and less cognitively challenging to humans than conventional representations.

computational linguistic, machine learning, natural language, (16 more...)

2210.07222

Country:

Europe (1.00)
Asia (0.68)
North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.86)

Industry:

Leisure & Entertainment (0.93)
Government (0.93)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset

Hennig, Leonhard, Thomas, Philippe, Möller, Sebastian

arXiv.org Artificial Intelligence, May-15-2023

Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. We analyze translation and annotation projection quality, identify error categories, and experimentally evaluate fine-tuned pretrained mono- and multilingual language models in common transfer learning scenarios. Our analyses show that machine translation is a viable strategy to transfer RE instances, with native speakers judging more than 83% of the translated instances to be linguistically and semantically acceptable. We find monolingual RE model performance to be comparable to the English original for many of the target languages, and that multilingual models trained on a combination of English and target language data can outperform their monolingual counterparts. However, we also observe a variety of translation and annotation projection errors, both due to the MT systems and linguistic features of the target languages, such as pronoun-dropping, compounding and inflection, that degrade dataset quality and RE model performance.

artificial intelligence, computational linguistic, natural language, (18 more...)

2305.04582

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
