AITopics | Schiller, Benjamin

Schiller, Benjamin

Argument Summarization and its Evaluation in the Era of Large Language Models

Altemeyer, Moritz, Eger, Steffen, Daxenberger, Johannes, Altendorf, Tim, Cimiano, Philipp, Schiller, Benjamin

arXiv.org Artificial Intelligence, Mar-17-2025

Large Language Models (LLMs) have revolutionized various Natural Language Generation (NLG) tasks, including Argument Summarization (ArgSum), a key subfield of Argument Mining (AM). This paper investigates the integration of state-of-the-art LLMs into ArgSum, including for its evaluation. In particular, we propose a novel prompt-based evaluation scheme, and validate it through a novel human benchmark dataset. Our work makes three main contributions: (i) the integration of LLMs into existing ArgSum frameworks, (ii) the development of a new LLM-based ArgSum system, benchmarked against prior methods, and (iii) the introduction of an advanced LLM-based evaluation scheme. We demonstrate that the use of LLMs substantially improves both the generation and evaluation of argument summaries, achieving state-of-the-art results and advancing the field of ArgSum.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2503.00847

Country:

Europe (1.00)
North America > United States > California (0.14)
North America > Mexico > Mexico City (0.14)
Asia > Middle East > Qatar (0.14)

Genre:

Research Report (0.90)
Overview (0.68)

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area > Vaccines (0.46)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Diversity Over Size: On the Effect of Sample and Topic Sizes for Argument Mining Datasets

Schiller, Benjamin, Daxenberger, Johannes, Gurevych, Iryna

arXiv.org Artificial Intelligence, Jul-15-2023

Argument Mining, that is, extracting argumentative sentences for a specific topic from large document sources, is inherently difficult for machine learning models and humans alike, as large Argument Mining datasets are rare and recognizing argumentative sentences requires expert knowledge. The task becomes even more difficult if it also involves stance detection of retrieved arguments. Given the cost and complexity of creating suitably large Argument Mining datasets, we ask whether acceptable performance really requires ever-larger datasets. Our findings show that, when using carefully composed training samples and a model pretrained on related tasks, we can reach 95% of the maximum performance while reducing the training sample size by at least 85%. This gain is consistent across three Argument Mining tasks on three different datasets. We also publish a new dataset for future benchmarking.
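The "95% of the maximum performance with at least 85% fewer samples" claim can be read as a simple calculation on a learning curve. The sketch below is only an illustration of that arithmetic: the function name `smallest_sufficient_size` and the curve values are made up for this example and are not results from the paper.

```python
def smallest_sufficient_size(curve, fraction=0.95):
    """Return (size, reduction) for the smallest training size whose score
    reaches `fraction` of the best score on the curve.

    `curve` is a list of (training_size, score) pairs.
    """
    best_score = max(score for _, score in curve)
    full_size = max(size for size, _ in curve)
    target = fraction * best_score
    # Sizes whose score is at least the target; the best point always qualifies.
    sufficient = [size for size, score in curve if score >= target]
    size = min(sufficient)
    reduction = 1 - size / full_size
    return size, reduction


# Illustrative numbers only (NOT taken from the paper).
toy_curve = [(1000, 0.60), (3000, 0.72), (6000, 0.74), (20000, 0.75)]
size, reduction = smallest_sufficient_size(toy_curve)
print(size, round(reduction, 2))
```

On this toy curve, 3,000 of 20,000 samples already reach 95% of the best score, which corresponds to an 85% reduction in training set size.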

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2205.11472

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (1.00)
Energy > Renewable (0.93)
(7 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Crowdsourcing on Sensitive Data with Privacy-Preserving Text Rewriting

Mouhammad, Nina, Daxenberger, Johannes, Schiller, Benjamin, Habernal, Ivan

arXiv.org Artificial Intelligence, Mar-6-2023

Most tasks in NLP require labeled data. Data labeling is often done on crowdsourcing platforms for scalability reasons. However, data can only be published on public platforms if it contains no privacy-relevant information. Textual data often contains sensitive information such as person names or locations. In this work, we investigate how removing personally identifiable information (PII) as well as applying differential privacy (DP) rewriting can enable text with privacy-relevant information to be used for crowdsourcing. We find that DP-rewriting before crowdsourcing can preserve privacy while still leading to good label quality for certain tasks and data. PII-removal led to good label quality in all examined tasks; however, it gives no privacy guarantees.
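As a rough illustration of what PII removal before crowdsourcing can look like, the sketch below replaces e-mail addresses and listed names and locations with placeholders. It is a toy under stated assumptions: the function `remove_pii`, the placeholder tokens, and the lookup lists are hypothetical, and the abstract does not say which tool the authors used. Like the PII removal discussed above, it offers no formal privacy guarantee.

```python
import re

# Simple e-mail pattern; real PII detection usually relies on NER models.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def remove_pii(text, names, locations):
    """Replace e-mail addresses and the given names/locations with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    for name in names:
        text = re.sub(rf"\b{re.escape(name)}\b", "[PERSON]", text)
    for location in locations:
        text = re.sub(rf"\b{re.escape(location)}\b", "[LOCATION]", text)
    return text


# Hypothetical example sentence and lookup lists.
redacted = remove_pii(
    "Anna Meier from Darmstadt wrote to anna@example.org.",
    names=["Anna Meier"],
    locations=["Darmstadt"],
)
print(redacted)  # [PERSON] from [LOCATION] wrote to [EMAIL].
```

Anything not covered by the pattern or the lists passes through unchanged, which is one reason this kind of redaction cannot promise privacy the way DP rewriting aims to.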

artificial intelligence, natural language, social media, (19 more...)

arXiv.org Artificial Intelligence

2303.03053

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Providers & Services > Reimbursement (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Focusing Knowledge-based Graph Argument Mining via Topic Modeling

Abels, Patrick, Ahmadi, Zahra, Burkhardt, Sophie, Schiller, Benjamin, Gurevych, Iryna, Kramer, Stefan

arXiv.org Artificial Intelligence, Feb-3-2021

Decision-making usually takes five steps: identifying the problem, collecting data, extracting evidence, identifying pro and con arguments, and making decisions. Focusing on extracting evidence, this paper presents a hybrid model that combines latent Dirichlet allocation and word embeddings to obtain external knowledge from structured and unstructured data. We study the task of sentence-level argument mining, as arguments mostly require some degree of world knowledge to be identified and understood. Given a topic and a sentence, the goal is to classify whether a sentence represents an argument in regard to the topic. We use a topic model to extract topic- and sentence-specific evidence from the structured knowledge base Wikidata, building a graph based on the cosine similarity between the entity word vectors of Wikidata and the vector of the given sentence. Also, we build a second graph based on topic-specific articles found via Google to tackle the general incompleteness of structured knowledge bases. Combining these graphs, we obtain a graph-based model which, as our evaluation shows, successfully capitalizes on both structured and unstructured data.
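The cosine-similarity step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the similarity threshold, the three-dimensional toy vectors, and the function names `cosine` and `build_evidence_edges` are assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_evidence_edges(sentence_vec, entity_vecs, threshold=0.5):
    """Link the sentence to every entity whose vector is similar enough.

    Returns (entity_name, similarity) pairs, most similar first.
    The threshold is a hypothetical parameter, not taken from the paper.
    """
    edges = []
    for name, vec in entity_vecs.items():
        similarity = cosine(sentence_vec, vec)
        if similarity >= threshold:
            edges.append((name, similarity))
    return sorted(edges, key=lambda edge: edge[1], reverse=True)


# Toy stand-ins for knowledge-base entity embeddings and a sentence embedding.
entity_vecs = {
    "nuclear_power": [0.9, 0.1, 0.0],
    "football": [0.0, 0.2, 0.9],
    "radiation": [0.7, 0.6, 0.1],
}
sentence_vec = [1.0, 0.2, 0.0]
edges = build_evidence_edges(sentence_vec, entity_vecs)
print([name for name, _ in edges])
```

With these toy vectors, only the two entities that point in roughly the same direction as the sentence are kept as graph neighbours.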

expert system, graph, text processing, (16 more...)

arXiv.org Artificial Intelligence

2102.02086

Country: Europe > Germany (0.28)

Genre: Research Report (1.00)

Industry:

Law (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.91)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.87)

UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification

Hanselowski, Andreas, Zhang, Hao, Li, Zile, Sorokin, Daniil, Schiller, Benjamin, Schulz, Claudia, Gurevych, Iryna

arXiv.org Artificial Intelligence, Sep-3-2018

The Fact Extraction and VERification (FEVER) shared task was launched to support the development of systems able to verify claims by extracting supporting or refuting facts from raw text. The shared task organizers provide a large-scale dataset for the consecutive steps involved in claim verification, in particular, document retrieval, fact extraction, and claim classification. In this paper, we present our claim verification pipeline approach, which, according to the preliminary results, scored third in the shared task, out of 23 competing systems. For the document retrieval, we implemented a new entity linking approach. In order to be able to rank candidate facts and classify a claim on the basis of several selected facts, we introduce two extensions to the Enhanced LSTM (ESIM).

deep learning, neural network, representation, (21 more...)

arXiv.org Artificial Intelligence

1809.01479

Country: Europe (0.48)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)

A Retrospective Analysis of the Fake News Challenge Stance Detection Task

Hanselowski, Andreas, PVS, Avinesh, Schiller, Benjamin, Caspelherr, Felix, Chaudhuri, Debanjan, Meyer, Christian M., Gurevych, Iryna

arXiv.org Artificial Intelligence, Jun-13-2018

The 2017 Fake News Challenge Stage 1 (FNC-1) shared task addressed a stance classification task as a crucial first step towards detecting fake news. To date, there is no in-depth analysis paper to critically discuss FNC-1's experimental setup, reproduce the results, and draw conclusions for next-generation stance classification methods. In this paper, we provide such an in-depth analysis for the three top-performing systems. We first find that FNC-1's proposed evaluation metric favors the majority class, which can be easily classified, and thus overestimates the true discriminative power of the methods. Therefore, we propose a new F1-based metric yielding a changed system ranking. Next, we compare the features and architectures used, which leads to a novel feature-rich stacked LSTM model that performs on par with the best systems, but is superior in predicting minority classes. To understand the methods' ability to generalize, we derive a new dataset and perform both in-domain and cross-domain experiments. Our qualitative and quantitative study helps to interpret the original FNC-1 scores and to understand which features help improve performance and why. Our new dataset and all source code used during the reproduction study are publicly available for future research.
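The point about majority-class bias can be illustrated with a small sketch. It assumes the F1-based metric is a macro-averaged F1 (the abstract only says "F1-based"), and it uses plain accuracy as a stand-in for a metric dominated by the majority class; it does not reproduce the official FNC-1 scoring formula. The labels and predictions are toy data.

```python
def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores, so every class counts equally."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data: a system that always predicts the majority class.
gold = ["unrelated"] * 8 + ["agree", "disagree"]
pred = ["unrelated"] * 10

print(accuracy(gold, pred))            # 0.8
print(round(macro_f1(gold, pred), 3))  # 0.296
```

The always-majority system looks strong under accuracy but weak under macro F1, because the two minority classes each contribute an F1 of zero to the average.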

deep learning, neural network, proceedings, (20 more...)

arXiv.org Artificial Intelligence

1806.0518

Country:

Europe (1.00)
Asia > Middle East (0.67)
North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
