Auditing Counterfire: Evaluating Advanced Counterargument Generation with Evidence and Style

Verma, Preetika, Jaidka, Kokil, Churina, Svetlana

Apr-19-2024–arXiv.org Artificial Intelligence

We audited large language models (LLMs) for their ability to create evidence-based and stylistic counter-arguments to posts from the Reddit ChangeMyView dataset. We benchmarked their rhetorical quality across a range of qualitative and quantitative metrics and then evaluated their persuasive abilities against human counter-arguments. Our evaluation is based on Counterfire, a new dataset of 32,000 counter-arguments generated by GPT-3.5 Turbo and Koala, their fine-tuned variants, and PaLM 2, with varying prompts for evidence use and argumentative style. GPT-3.5 Turbo ranked highest in argument quality, with strong paraphrasing and style adherence, particularly in 'reciprocity' style arguments. However, the stylistic counter-arguments still fall short of human persuasive standards, where people also preferred reciprocal to evidence-based rebuttals. The findings suggest that a balance between evidentiality and stylistic elements is vital to a compelling counter-argument. We close with a discussion of future research directions and implications for evaluating LLM outputs.

argument, counter-argument, evaluation

Country:
- Africa > Niger (0.04)
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- North America
  - United States
    - Pennsylvania (0.04)
    - New York (0.04)
    - Oregon > Multnomah County
      - Portland (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Russia (0.04)
  - Germany > Hamburg (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Bulgaria > Sofia City Province
    - Sofia (0.04)
- Asia
  - Singapore (0.04)
  - Russia (0.04)
  - Middle East > Jordan (0.04)
  - India (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Media > News (1.00)
- Law (1.00)
- Government (1.00)
- Health & Medicine > Therapeutic Area
  - Immunology (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.91)
