Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review
Croxford, Emma, Gao, Yanjun, Pellegrino, Nicholas, Wong, Karen K., Wills, Graham, First, Elliot, Liao, Frank J., Goswami, Cherodeep, Patterson, Brian, Afshar, Majid
arXiv.org Artificial Intelligence
Large Language Models have advanced clinical Natural Language Generation, creating opportunities to manage the volume of medical text. However, the high-stakes nature of medicine requires reliable evaluation, which remains a challenge. In this narrative review, we assess the current state of evaluation for clinical summarization tasks and propose future directions to address the resource constraints of expert human evaluation.
Sep-26-2024
- Country:
  - Asia (1.00)
  - Europe (1.00)
  - North America
    - Canada > British Columbia
    - United States
      - Massachusetts (0.28)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Wisconsin > Dane County
        - Madison (0.14)
- Genre:
  - Research Report (1.00)