CREAM: Comparison-Based Reference-Free ELO-Ranked Automatic Evaluation for Meeting Summarization

Ziwei Gong, Lin Ai, Harshsaiprasad Deshpande, Alexander Johnson, Emmy Phung, Zehui Wu, Ahmad Emami, Julia Hirschberg

arXiv.org Artificial Intelligence 

The rapid advancement of Large Language Models (LLMs) has significantly influenced the field of automatic evaluation for text summarization. LLMs offer the potential to streamline the evaluation process, making it faster and more cost-effective compared to traditional human evaluation (Liu et al., 2023; Wang et al., 2023). However, despite the progress in automatic evaluation techniques, existing methods primarily target general-purpose summarization tasks, which typically involve shorter, more straightforward text inputs, which may not

In this paper, we address this gap by developing a new evaluation framework tailored specifically for meeting summarization. We propose CREAM (Comparison-based Reference-free Elo-ranked Automatic evaluation for Meeting summarization), a novel system designed to fill the gaps in specialized and customizable evaluation for meeting summaries, as illustrated in Figure 1. Our research addresses the following key questions:

1. Do current LLM-based automatic evaluators work effectively for meeting summarization?
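To make the "Elo-ranked" component concrete, the sketch below shows how pairwise comparison verdicts (e.g., from an LLM judge asked to pick the better of two summaries) can be aggregated into a ranking with standard Elo updates. This is a minimal illustration under assumed defaults (initial rating 1000, K-factor 32); the function names, the judging setup, and these parameters are hypothetical and not taken from the paper.

```python
def elo_update(r_a, r_b, a_wins, k=32.0):
    """Return updated (r_a, r_b) after one pairwise comparison.

    Standard Elo: the expected score follows a logistic curve in the
    rating difference, and the winner gains what the loser sheds.
    """
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    new_a = r_a + k * (score_a - expected_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


def rank_summaries(names, comparisons, start=1000.0):
    """Rank candidate summaries from pairwise (winner, loser) verdicts.

    `comparisons` would come from a comparison-based judge; here it is
    just a list of decided pairs. Returns names sorted best-first.
    """
    ratings = {n: start for n in names}
    for winner, loser in comparisons:
        ratings[winner], ratings[loser] = elo_update(
            ratings[winner], ratings[loser], a_wins=True
        )
    return sorted(ratings, key=ratings.get, reverse=True)


# Hypothetical example: three systems judged on every pair.
order = rank_summaries(
    ["sys_A", "sys_B", "sys_C"],
    [("sys_A", "sys_B"), ("sys_A", "sys_C"), ("sys_B", "sys_C")],
)
# sys_A wins both its comparisons, sys_C loses both, so the
# resulting order is ["sys_A", "sys_B", "sys_C"].
```

Because Elo only consumes relative preferences, no reference summary is needed, which matches the reference-free framing above; the trade-off is that rankings depend on which pairs are compared and in what order.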