Gla-AI4BioMed at RRG24: Visual Instruction-tuned Adaptation for Radiology Report Generation

Zhang, Xi, Meng, Zaiqiao, Lever, Jake, Ho, Edmond S. L.

Dec-6-2024–arXiv.org Artificial Intelligence

We introduce a radiology-focused visual language model designed to generate radiology reports from chest X-rays. Building on previous findings that large language models (LLMs) can acquire multimodal capabilities when aligned with pretrained vision encoders, we demonstrate similar potential with chest X-ray images. This integration enhances the model's ability to understand and describe chest X-ray images. Our model combines an image encoder with a fine-tuned LLM based on the Vicuna-7B architecture, enabling it to generate different sections of a radiology report with notable accuracy. The training process involves a two-stage approach: (i) initial alignment of chest X-ray features with the LLM, followed by (ii) fine-tuning for radiology report generation.
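The two-stage training schedule described in the abstract can be illustrated with a minimal sketch. The abstract does not say which components are frozen in each stage, so the schedule below is an assumption borrowed from common visual instruction-tuning recipes: stage one updates only a projection layer between the image encoder and the LLM, and stage two updates that layer together with the LLM. The component names (`image_encoder`, `adapter`, `llm`), the stage labels, and the `configure_stage` helper are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """A named model part with a flag saying whether training updates it."""
    name: str
    trainable: bool = False


def build_model():
    # Hypothetical component names. The abstract names only an image encoder
    # and a Vicuna-7B-based LLM; the adapter is an assumed projection layer
    # that maps image features into the LLM's embedding space.
    return {
        "image_encoder": Component("image_encoder"),
        "adapter": Component("adapter"),
        "llm": Component("llm"),
    }


def configure_stage(model, stage):
    """Set trainable flags for one stage and return the trainable names.

    The freezing schedule is an assumption, not the paper's stated recipe.
    """
    if stage == "alignment":
        # Stage (i): align chest X-ray features with the LLM.
        trainable = {"adapter"}
    elif stage == "report_finetune":
        # Stage (ii): fine-tune for radiology report generation.
        trainable = {"adapter", "llm"}
    else:
        raise ValueError(f"unknown stage: {stage!r}")

    for name, component in model.items():
        component.trainable = name in trainable
    return sorted(name for name, c in model.items() if c.trainable)


model = build_model()
stage1 = configure_stage(model, "alignment")
stage2 = configure_stage(model, "report_finetune")
```

In a real training loop, these flags would typically be turned into per-parameter gradient settings before each stage's optimizer is built. Keeping the image encoder frozen throughout is also an assumption of this sketch.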

Country:
- North America > United States (0.05)
- Europe
  - Monaco (0.04)
  - Spain
    - Aragón (0.04)
    - Catalonia > Barcelona Province
      - Barcelona (0.04)
  - Slovenia > Drava
    - Municipality of Benedikt > Benedikt (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - Thailand > Bangkok
    - Bangkok (0.04)
  - Middle East
    - UAE (0.04)
    - Jordan (0.04)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine
  - Nuclear Medicine (1.00)
  - Diagnostic Medicine > Imaging (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
