Are Today's LLMs Ready to Explain Well-Being Concepts?
Bohan Jiang, Dawei Li, Zhen Tan, Chengshuai Zhao, Huan Liu
arXiv.org Artificial Intelligence
Well-being encompasses mental, physical, and social dimensions essential to personal growth and informed life decisions. As individuals increasingly consult Large Language Models (LLMs) to understand well-being, a key challenge emerges: Can LLMs generate explanations that are not only accurate but also tailored to diverse audiences? High-quality explanations require both factual correctness and the ability to meet the expectations of users with varying expertise. In this work, we construct a large-scale dataset comprising 43,880 explanations of 2,194 well-being concepts, generated by ten diverse LLMs. We introduce a principle-guided LLM-as-a-judge evaluation framework, employing dual judges to assess explanation quality. Furthermore, we show that fine-tuning an open-source LLM using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) can significantly enhance the quality of generated explanations. Our results reveal: (1) The proposed LLM judges align well with human evaluations; (2) explanation quality varies significantly across models, audiences, and categories; and (3) DPO- and SFT-finetuned models outperform their larger counterparts, demonstrating the effectiveness of preference-based learning for specialized explanation tasks.
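The abstract reports that DPO fine-tuning improves explanation quality. As a rough illustration of the objective involved (not the paper's actual training code, and all names here are illustrative), the per-example Direct Preference Optimization loss compares how much the tuned policy prefers the human- or judge-preferred explanation over the rejected one, relative to a frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (a sketch, not the paper's implementation).

    Each argument is the summed token log-probability of the chosen
    (preferred) or rejected explanation under the policy being tuned
    or the frozen reference model; beta scales the implicit reward.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): shrinks as the policy favors the chosen answer
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy has not yet moved from the reference model, the margin is zero and the loss is log 2 ≈ 0.693; training pushes the margin positive, which is what "preference-based learning" optimizes in this setting.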
Aug-7-2025