GREEN: Generative Radiology Report Evaluation and Error Notation
Sophie Ostmeier, Justin Xu, Zhihong Chen, Maya Varma, Louis Blankemeier, Christian Bluethgen, Arne Edward Michalson, Michael Moseley, Curtis Langlotz, Akshay S. Chaudhari, Jean-Benoit Delbrouck
arXiv.org Artificial Intelligence
Evaluating radiology reports is a challenging problem as factual correctness is extremely important due to the need for accurate medical communication about medical images. Existing automatic evaluation metrics either suffer from failing to consider factual correctness (e.g., BLEU and ROUGE) or are limited in their interpretability (e.g., F1CheXpert and F1RadGraph). In this paper, we introduce GREEN (Generative Radiology Report Evaluation and Error Notation), a radiology report generation metric that leverages the natural language understanding of language models to identify and explain clinically significant errors in candidate reports, both quantitatively and ...

Machine learning has enabled great progress in the automatic interpretation of images, where vision language models (VLMs) translate features of images into text (Radford et al., 2021; Liu et al., 2024). In the medical domain, patient images are interpreted by radiologists, which is referred to as radiology report generation (RRG). Automated and high-quality RRG has the potential to greatly reduce the repetitive work of radiologists, alleviate burdens arising from shortage of radiologists, generally improve clinical communication (Kahn Jr et al., 2009), and increase the accuracy of radiology reports (Rajpurkar and Lungren, 2023). Commonly used evaluation metrics in RRG literature (Lin, 2004; Zhang et al., 2019; Smit et al., 2020; Delbrouck et al., 2022) seek to evaluate a generated radiology report against a reference report written by a radiologist by leveraging simple n-grams overlap, general language similarity, pathology identification within specific imaging modalities and disease classes, and commercially-available large language models.
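The claim that n-gram overlap metrics fail to consider factual correctness can be illustrated with a minimal sketch. The function below is a simplified unigram-overlap F1 in the spirit of ROUGE-1, not the official BLEU or ROUGE implementation and not the GREEN metric. The function name `unigram_f1` and the three example sentences are hypothetical, made up for this illustration rather than taken from the paper.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style F1 over lowercase whitespace tokens.

    Illustration only: no stemming, no punctuation handling, and no
    brevity penalty, unlike real BLEU/ROUGE implementations.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical report snippets (assumptions for illustration).
reference = "no pleural effusion or pneumothorax"
contradicts = "pleural effusion and pneumothorax are present"  # opposite finding
paraphrase = "the pleural spaces are clear"                     # similar finding

score_contradicts = unigram_f1(contradicts, reference)
score_paraphrase = unigram_f1(paraphrase, reference)

print(f"contradicting candidate: {score_contradicts:.3f}")
print(f"paraphrasing candidate:  {score_paraphrase:.3f}")
```

In this toy case the candidate that states the opposite finding shares more tokens with the reference than the candidate that agrees with it, so a purely lexical score ranks the clinically wrong report higher.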
May-6-2024
- Country:
- Asia > Middle East
- UAE (0.14)
- North America > United States (0.15)
- Genre:
- Research Report > Experimental Study (0.69)
- Industry:
- Health & Medicine
- Diagnostic Medicine > Imaging (1.00)
- Nuclear Medicine (1.00)
- Technology: