Argument Summarization and its Evaluation in the Era of Large Language Models

Moritz Altemeyer, Steffen Eger, Johannes Daxenberger, Tim Altendorf, Philipp Cimiano, Benjamin Schiller

Mar-17-2025 – arXiv.org Artificial Intelligence

Large Language Models (LLMs) have revolutionized various Natural Language Generation (NLG) tasks, including Argument Summarization (ArgSum), a key subfield of Argument Mining (AM). This paper investigates the integration of state-of-the-art LLMs into ArgSum, covering both the generation of argument summaries and their evaluation. In particular, we propose a novel prompt-based evaluation scheme and validate it on a new human benchmark dataset. Our work makes three main contributions: (i) the integration of LLMs into existing ArgSum frameworks, (ii) the development of a new LLM-based ArgSum system, benchmarked against prior methods, and (iii) the introduction of an advanced LLM-based evaluation scheme. We demonstrate that the use of LLMs substantially improves both the generation and evaluation of argument summaries, achieving state-of-the-art results and advancing the field of ArgSum.

large language model, machine learning, natural language



Genre:
- Research Report (0.90)
- Overview (0.68)

Industry:
- Energy (1.00)
- Health & Medicine
  - Therapeutic Area > Vaccines (0.46)
  - Pharmaceuticals & Biotechnology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Statistical Learning > Clustering (0.93)
