AITopics | Bauckhage, Christian

Collaborating Authors

Bauckhage, Christian

Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness

Hillebrand, Lars, Pradhan, Prabhupad, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Jun-6-2024

We introduce "pointer-guided segment ordering" (SO), a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations in large language models. Our methodology leverages a self-attention-driven pointer network to restore the original sequence of shuffled text segments, addressing the challenge of capturing the structural coherence and contextual dependencies within documents. This pre-training approach is complemented by a fine-tuning methodology that incorporates dynamic sampling, augmenting the diversity of training instances and improving sample efficiency for various downstream applications. We evaluate our method on a diverse set of datasets, demonstrating its efficacy in tasks requiring sequential text classification across scientific literature and financial reporting domains. Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures, leading to state-of-the-art performance in downstream classification tasks.
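
The segment-ordering objective described above can be illustrated with a small data-preparation sketch. It only shows how a shuffled input and its pointer targets might be built; the pointer network itself is omitted, and the function names `make_segment_ordering_example` and `restore_order` are hypothetical, not taken from the paper.

```python
import random


def make_segment_ordering_example(segments, seed=0):
    """Shuffle segments and return (shuffled, targets).

    targets[i] is the position in `shuffled` that holds the i-th
    segment of the original document, i.e. what a pointer network
    would be trained to point at in output step i.
    """
    rng = random.Random(seed)
    permutation = list(range(len(segments)))
    rng.shuffle(permutation)
    # shuffled[j] is the original segment with index permutation[j]
    shuffled = [segments[i] for i in permutation]
    targets = [0] * len(segments)
    for shuffled_pos, original_pos in enumerate(permutation):
        targets[original_pos] = shuffled_pos
    return shuffled, targets


def restore_order(shuffled, pointers):
    """Rebuild the document by following one pointer per output step."""
    return [shuffled[p] for p in pointers]


segments = ["Introduction.", "Method.", "Results.", "Conclusion."]
shuffled, targets = make_segment_ordering_example(segments, seed=42)
print(shuffled)
print(restore_order(shuffled, targets) == segments)
```

Following the gold pointers always recovers the original order, so a model's predicted pointers can be scored directly against `targets`.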

large language model, machine learning, natural language, (18 more...)

2406.04156

Country: Europe > Germany > North Rhine-Westphalia (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SugarViT -- Multi-objective Regression of UAV Images with Vision Transformers and Deep Label Distribution Learning Demonstrated on Disease Severity Prediction in Sugar Beet

Günder, Maurice, Yamati, Facundo Ramón Ispizua, Alcántara, Abel Andree Barreto, Mahlein, Anne-Katrin, Sifa, Rafet, Bauckhage, Christian

arXiv.org Artificial Intelligence, Feb-1-2024

Remote sensing and artificial intelligence are pivotal technologies of precision agriculture nowadays. The efficient retrieval of large-scale field imagery combined with machine learning techniques shows success in various tasks like phenotyping, weeding, cropping, and disease control. This work introduces a machine learning framework for automated large-scale plant-specific trait annotation for the use case of disease severity scoring for Cercospora Leaf Spot (CLS) in sugar beet. With concepts of Deep Label Distribution Learning (DLDL), special loss functions, and a tailored model architecture, we develop an efficient Vision Transformer based model for disease severity scoring called SugarViT. One novelty in this work is the combination of remote sensing data with environmental parameters of the experimental sites for disease severity prediction. Although the model is evaluated on this special use case, it is kept as generic as possible so that it is also applicable to various image-based classification and regression tasks. With our framework, it is even possible to learn models on multi-objective problems, as we show by pretraining on environmental metadata.
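
The general idea behind label distribution learning can be sketched in a few lines: a scalar label is turned into a discrete distribution over score bins, the model is trained to match that distribution, and the prediction is read off as the expected value. This is a generic illustration, not SugarViT's implementation; the 0 to 10 score range, the Gaussian shape, `sigma`, and all function names are assumptions.

```python
import math


def label_to_distribution(label, bins, sigma=1.0):
    """Discretized Gaussian over `bins`, centered on the scalar label."""
    weights = [math.exp(-0.5 * ((b - label) / sigma) ** 2) for b in bins]
    total = sum(weights)
    return [w / total for w in weights]


def expected_value(distribution, bins):
    """Scalar prediction recovered from a distribution over bins."""
    return sum(p * b for p, b in zip(distribution, bins))


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), a common loss for matching distributions."""
    return sum(
        t * math.log((t + eps) / (p + eps)) for t, p in zip(target, predicted)
    )


bins = list(range(11))  # hypothetical severity scores 0..10
target = label_to_distribution(5.0, bins, sigma=1.0)
uniform = [1.0 / len(bins)] * len(bins)
print(expected_value(target, bins))
print(kl_divergence(target, uniform))
```

Compared with plain regression, the soft target tells the model that neighboring scores are nearly right, which is useful when annotated severity scores are themselves noisy.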

artificial intelligence, information, machine learning, (19 more...)

2311.03076

Country: Europe > Germany (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.54)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.82)

Controlled Randomness Improves the Performance of Transformer Models

Deußer, Tobias, Zhao, Cong, Krämer, Wolfgang, Leonhard, David, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Oct-20-2023

The emergence of pre-trained transformer models brought a massive breakthrough in the field of natural language processing. During pre-training, such transformer models can learn generic language representations with strong generalization capabilities by applying a self-supervised learning approach and leveraging large text corpora. Unlike traditional training methods, these pre-trained language models can be fine-tuned on various downstream tasks without training from scratch, significantly reducing training costs while achieving excellent performance. Models like BERT (Devlin et al., 2019), ELECTRA (Clark et al., 2020), or T5 (Raffel et al., 2020) have achieved remarkable results on several language processing tasks and the most recent developments of even larger language models, made prominent by GPT-3 (Brown et al., 2020) and GPT-4 (OpenAI, 2023) but not limited to these two

large language model, machine learning, natural language, (18 more...)

2310.13526

Country: Europe > Germany > North Rhine-Westphalia (0.14)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Informed Named Entity Recognition Decoding for Generative Language Models

Deußer, Tobias, Hillebrand, Lars, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Aug-15-2023

Ever-larger language models with ever-increasing capabilities are by now well-established text processing tools. Alas, information extraction tasks such as named entity recognition are still largely unaffected by this progress as they are primarily based on the previous generation of encoder-only transformer models. Here, we propose a simple yet effective approach, Informed Named Entity Recognition Decoding (iNERD), which treats named entity recognition as a generative process. It leverages the language understanding capabilities of recent generative models in a future-proof manner and employs an informed decoding scheme incorporating the restricted nature of information extraction into open-ended text generation, improving performance and eliminating any risk of hallucinations. We coarse-tune our model on a merged named entity corpus to strengthen its performance, evaluate five generative language models on eight named entity recognition datasets, and achieve remarkable results, especially in an environment with an unknown entity class set, demonstrating the adaptability of the approach.
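
One way to picture an informed decoding scheme is as constrained greedy decoding: at each step, only tokens that can legitimately appear in the output are eligible. The sketch below assumes the allowed set consists of the input tokens, the entity class names, and a few control tokens; this choice, the control token strings, and both function names are illustrative assumptions rather than the iNERD implementation.

```python
def build_allowed_tokens(input_tokens, entity_classes,
                         control_tokens=("<sep>", "<eos>")):
    """Tokens the decoder may emit: copied input words, class names, controls."""
    return set(input_tokens) | set(entity_classes) | set(control_tokens)


def informed_greedy_step(logits, allowed_tokens):
    """Pick the highest-scoring token among the allowed ones only."""
    candidates = {
        token: score for token, score in logits.items()
        if token in allowed_tokens
    }
    if not candidates:
        raise ValueError("no allowed token has a score")
    return max(candidates, key=candidates.get)


input_tokens = ["Angela", "Merkel", "visited", "Bonn"]
allowed = build_allowed_tokens(input_tokens, ["person", "location"])
# Toy scores: the unconstrained argmax would be "Berlin", which is not
# part of the input sentence and would therefore be a hallucination.
logits = {"Berlin": 3.1, "Bonn": 2.7, "person": 0.4, "<eos>": 0.1}
print(informed_greedy_step(logits, allowed))
```

Because every emitted entity must be copied from the input, an entity string that never occurred in the text cannot be produced under this restriction.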

artificial intelligence, machine learning, natural language, (15 more...)

2308.07791

Country:

North America (0.28)
Europe > Germany > North Rhine-Westphalia (0.14)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models

Hillebrand, Lars, Berger, Armin, Deußer, Tobias, Dilmaghani, Tim, Khaled, Mohamed, Kliem, Bernd, Loitz, Rüdiger, Pielka, Maren, Leonhard, David, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Aug-14-2023

Auditing financial documents is a very tedious and time-consuming process. As of today, it can already be simplified by employing AI-based solutions to recommend relevant text passages from a report for each legal requirement of rigorous accounting standards. However, these methods need to be fine-tuned regularly, and they require abundant annotated data, which is often lacking in industrial environments. Hence, we present ZeroShotALI, a novel recommender system that leverages a state-of-the-art large language model (LLM) in conjunction with a domain-specifically optimized transformer-based text-matching solution. We find that a two-step approach of first retrieving a number of best matching document sections per legal requirement with a custom BERT-based model and second filtering these selections using an LLM yields significant performance improvements over existing approaches.
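
The two-step approach can be sketched as a retrieve-then-filter pipeline. Both stages below are stand-ins: `token_overlap_score` is a toy replacement for the BERT-based matcher, and `keep_fn` is a placeholder for the LLM judgment. All names and the sample texts are hypothetical.

```python
def token_overlap_score(requirement, section):
    """Toy relevance score: share of requirement words found in the section."""
    req_words = set(requirement.lower().split())
    sec_words = set(section.lower().split())
    return len(req_words & sec_words) / len(req_words) if req_words else 0.0


def recommend(requirement, sections, k, keep_fn,
              score_fn=token_overlap_score):
    """Step 1: retrieve the k best-scoring sections.
    Step 2: keep only those the filter accepts, preserving rank order."""
    ranked = sorted(
        sections, key=lambda s: score_fn(requirement, s), reverse=True
    )
    candidates = ranked[:k]
    return [s for s in candidates if keep_fn(requirement, s)]


sections = [
    "The revenue recognition policy is described here",
    "Weather was nice",
    "Revenue grew strongly",
]
requirement = "disclose revenue recognition policy"
# Placeholder filter; in the described system an LLM makes this decision.
keep = lambda req, sec: "policy" in sec.lower()
print(recommend(requirement, sections, k=2, keep_fn=keep))
```

Keeping the expensive filter behind a cheap retriever means the second stage only ever sees `k` candidates per requirement.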

artificial intelligence, machine learning, natural language, (17 more...)

2308.06111

Country:

Europe > Ireland (0.16)
Europe > Germany (0.15)
North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

An Empirical Evaluation of the Rashomon Effect in Explainable Machine Learning

Müller, Sebastian, Toborek, Vanessa, Beckh, Katharina, Jakobs, Matthias, Bauckhage, Christian, Welke, Pascal

arXiv.org Artificial Intelligence, Jun-29-2023

The Rashomon Effect describes the following phenomenon: for a given dataset there may exist many models with equally good performance but with different solution strategies. The Rashomon Effect has implications for Explainable Machine Learning, especially for the comparability of explanations. We provide a unified view on three different comparison scenarios and conduct a quantitative evaluation across different datasets, models, attribution methods, and metrics. We find that hyperparameter-tuning plays a role and that metric selection matters. Our results provide empirical support for previously anecdotal evidence and exhibit challenges for both scientists and practitioners.

explanation, machine learning, natural language, (17 more...)

2306.15786

Country:

Europe > Austria (0.28)
Europe > Germany > North Rhine-Westphalia (0.28)

Genre: Research Report > New Finding (0.35)

Industry: Health & Medicine (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

A New Aligned Simple German Corpus

Toborek, Vanessa, Busch, Moritz, Boßert, Malte, Bauckhage, Christian, Welke, Pascal

arXiv.org Artificial Intelligence, May-26-2023

"Leichte Sprache", the German counterpart to Simple English, is a regulated language aiming to facilitate complex written language that would otherwise stay inaccessible to different groups of people. We present a new sentence-aligned monolingual corpus for Simple German -- German. It contains multiple document-aligned sources which we have aligned using automatic sentence-alignment methods. We evaluate our alignments based on a manually labelled subset of aligned documents. The quality of our sentence alignments, as measured by F1-score, surpasses previous work. We publish the dataset under CC BY-SA and the accompanying code under MIT license.

information retrieval, machine learning, natural language, (20 more...)

2209.01106

Country: Europe > Germany (0.28)

Genre:

Research Report (1.00)
Overview (0.68)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)

Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

Ramamurthy, Rajkumar, Ammanabrolu, Prithviraj, Brantley, Kianté, Hessel, Jack, Sifa, Rafet, Bauckhage, Christian, Hajishirzi, Hannaneh, Choi, Yejin

arXiv.org Artificial Intelligence, Feb-28-2023

We tackle the problem of aligning pre-trained large language models (LMs) with human preferences. If we view text generation as a sequential decision-making problem, reinforcement learning (RL) appears to be a natural conceptual framework. However, using RL for LM-based generation faces empirical challenges, including training instability due to the combinatorial action space, as well as a lack of open-source libraries and benchmarks customized for LM alignment. Thus, a question rises in the research community: is RL a practical paradigm for NLP? To help answer this, we first introduce an open-source modular library, RL4LMs (Reinforcement Learning for Language Models), for optimizing language generators with RL. The library consists of on-policy RL algorithms that can be used to train any encoder or encoder-decoder LM in the HuggingFace library (Wolf et al. 2020) with an arbitrary reward function. Next, we present the GRUE (General Reinforced-language Understanding Evaluation) benchmark, a set of 6 language generation tasks which are supervised not by target strings, but by reward functions which capture automated measures of human preference. GRUE is the first leaderboard-style evaluation of RL algorithms for NLP tasks. Finally, we introduce an easy-to-use, performant RL algorithm, NLPO (Natural Language Policy Optimization) that learns to effectively reduce the combinatorial action space in language generation. We show 1) that RL techniques are generally better than supervised methods at aligning LMs to human preferences; and 2) that NLPO exhibits greater stability and performance than previous policy gradient methods (e.g., PPO (Schulman et al. 2017)), based on both automatic and human evaluations.

machine learning, natural language, reinforcement learning, (19 more...)

2210.01241

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.92)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.92)

Industry:

Transportation > Air (1.00)
Media > Film (1.00)
Leisure & Entertainment > Sports > Soccer (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

KPI-EDGAR: A Novel Dataset and Accompanying Metric for Relation Extraction from Financial Documents

Deußer, Tobias, Ali, Syed Musharraf, Hillebrand, Lars, Nurchalifah, Desiana, Jacob, Basil, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Oct-17-2022

We introduce KPI-EDGAR, a novel dataset for Joint Named Entity Recognition and Relation Extraction building on financial reports uploaded to the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system, where the main objective is to extract Key Performance Indicators (KPIs) from financial documents and link them to their numerical values and other attributes. We further provide four accompanying baselines for benchmarking potential future research. Additionally, we propose a new way of measuring the success of said extraction process by incorporating a word-level weighting scheme into the conventional F1 score to better model the inherently fuzzy borders of the entity pairs of a relation in this domain.
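
The abstract does not give the weighting formula, so the following is not the KPI-EDGAR metric. It is only a sketch of how partial word overlap can earn fractional credit inside an F1 computation, assuming spans are represented as sets of token indices and credit is the Jaccard overlap with the best-matching span. The function names are hypothetical.

```python
def span_credit(predicted, gold):
    """Jaccard overlap between two spans given as sets of token indices."""
    union = predicted | gold
    return len(predicted & gold) / len(union) if union else 0.0


def soft_f1(predicted_spans, gold_spans):
    """F1 where each span earns partial credit from its best-matching partner."""
    if not predicted_spans or not gold_spans:
        return 0.0
    precision = sum(
        max(span_credit(p, g) for g in gold_spans) for p in predicted_spans
    ) / len(predicted_spans)
    recall = sum(
        max(span_credit(p, g) for p in predicted_spans) for g in gold_spans
    ) / len(gold_spans)
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [{4, 5, 6}]          # e.g. a three-word KPI mention
exact = [{4, 5, 6}]
off_by_one = [{4, 5}]       # boundary misses the last word
print(soft_f1(exact, gold))
print(soft_f1(off_by_one, gold))
```

Under a conventional exact-match F1 the off-by-one prediction would score zero; here it keeps most of its credit, which is the kind of tolerance for fuzzy entity borders the abstract motivates.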

extraction, machine learning, natural language, (13 more...)

doi: 10.1109/ICMLA55696.2022.00254

2210.09163

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Street-Map Based Validation of Semantic Segmentation in Autonomous Driving

von Rueden, Laura, Wirtz, Tim, Hueger, Fabian, Schneider, Jan David, Piatkowski, Nico, Bauckhage, Christian

arXiv.org Artificial Intelligence, Apr-15-2021

Artificial intelligence for autonomous driving must meet strict requirements on safety and robustness, which motivates the thorough validation of learned models. However, current validation approaches mostly require ground truth data and are thus both cost-intensive and limited in their applicability. We propose to overcome these limitations by a model agnostic validation using a-priori knowledge from street maps. In particular, we show how to validate semantic segmentation masks and demonstrate the potential of our approach using OpenStreetMap. We introduce validation metrics that indicate false positive or negative road segments. Besides the validation approach, we present a method to correct the vehicle's GPS position so that a more accurate localization can be used for the street-map based validation. Lastly, we present quantitative results on the Cityscapes dataset indicating that our validation approach can indeed uncover errors in semantic segmentation masks.
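
The idea of validating a segmentation mask without ground truth can be sketched as a pixel-wise comparison against a road mask rendered from a street map. The abstract does not define the metrics, so the two ratios below are assumptions chosen for illustration, as are the function name and the tiny binary masks; projecting map data into the camera view is not shown.

```python
def road_disagreement(predicted_mask, map_mask):
    """Compare a predicted road mask with a map-derived road mask.

    Both masks are equally sized 2D lists of 0/1 values. Returns the
    share of predicted road pixels that the map does not support
    (candidate false positives) and the share of map road pixels the
    prediction misses (candidate false negatives).
    """
    false_pos = false_neg = pred_road = map_road = 0
    for pred_row, map_row in zip(predicted_mask, map_mask):
        for pred, ref in zip(pred_row, map_row):
            pred_road += 1 if pred else 0
            map_road += 1 if ref else 0
            if pred and not ref:
                false_pos += 1
            elif ref and not pred:
                false_neg += 1
    return {
        "false_positive_share": false_pos / pred_road if pred_road else 0.0,
        "false_negative_share": false_neg / map_road if map_road else 0.0,
    }


predicted = [[1, 1, 0],
             [0, 1, 0]]
from_map = [[1, 0, 0],
            [0, 1, 1]]
print(road_disagreement(predicted, from_map))
```

High values only flag frames for review: the map itself can be outdated or misaligned, which is why the abstract also describes correcting the vehicle's GPS position before the comparison.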

deep learning, neural network, segmentation, (21 more...)

2104.07538

Country: Europe > Germany (0.14)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.89)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)
