SageLM: A Multi-aspect and Explainable Large Language Model for Speech Judgement

Yuan Ge, Junxiang Zhang, Xiaoqian Liu, Bei Li, Xiangnan Ma, Chenglong Wang, Kaiyang Ye, Yangfan Du, Linfeng Zhang, Yuxin Huang, Tong Xiao, Zhengtao Yu, JingBo Zhu

arXiv.org Artificial Intelligence 

Speech-to-Speech (S2S) Large Language Models (LLMs) are foundational to natural human-computer interaction, enabling end-to-end spoken dialogue systems. However, evaluating these models remains a fundamental challenge. We propose \texttt{SageLM}, an end-to-end, multi-aspect, and explainable speech LLM for the comprehensive evaluation of S2S LLMs. First, unlike cascaded approaches that disregard acoustic features, SageLM jointly assesses both semantic and acoustic dimensions. Second, it leverages rationale-based supervision to enhance explainability and guide model learning, achieving stronger alignment with evaluation outcomes than rule-based reinforcement learning methods. Third, we introduce \textit{SpeechFeedback}, a synthetic preference dataset, and employ a two-stage training paradigm to mitigate the scarcity of speech preference data. Trained on both semantic and acoustic dimensions, SageLM achieves an 82.79\% agreement rate with human evaluators, outperforming cascaded and SLM-based baselines by at least 7.42\% and 26.20\%, respectively.
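The headline metric here is agreement with human evaluators. The abstract does not specify how that rate is computed, but a common formulation for judge models is the fraction of items on which the model's preference matches the human label; a minimal sketch under that assumption (all names are illustrative, not from the paper):

```python
# Hedged sketch: one common way an "agreement rate" between a judge
# model and human evaluators is computed -- the fraction of items on
# which the model's preference matches the human label.
# Function and variable names here are illustrative assumptions.

def agreement_rate(model_judgements, human_labels):
    """Return the fraction of items where the judge agrees with humans."""
    if len(model_judgements) != len(human_labels):
        raise ValueError("judgement and label lists must have equal length")
    matches = sum(m == h for m, h in zip(model_judgements, human_labels))
    return matches / len(human_labels)

# Toy example: pairwise preferences over two candidate responses.
model = ["A", "B", "A", "A", "B"]
human = ["A", "B", "B", "A", "B"]
print(f"{agreement_rate(model, human):.2%}")  # 4 of 5 items agree -> 80.00%
```

On such a scale, the reported 82.79% would mean the model sides with human judges on roughly five of every six evaluation items.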