MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning

Sunan He, Yuxiang Nie, Zhixuan Chen, Zhiyuan Cai, Hongmei Wang, Shu Yang, Hao Chen

Apr-23-2024, arXiv.org Artificial Intelligence

Large-scale vision-language models have advanced rapidly and show remarkable capabilities across a wide range of tasks. In medicine, however, the scarcity of extensive, high-quality image-text data has greatly hindered the development of comparable large-scale medical vision-language models. In this work, we present a diagnosis-guided bootstrapping strategy that exploits both image and label information to construct vision-language datasets. Using the constructed dataset, we develop MedDr, a generalist foundation model for healthcare capable of handling diverse medical data modalities, including radiology, pathology, dermatology, retinography, and endoscopy. In addition, we propose a simple but effective retrieval-augmented medical diagnosis strategy that is applied during inference and enhances the model's generalization ability. Extensive experiments on visual question answering, medical report generation, and medical image diagnosis demonstrate the superiority of our method.
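
The abstract does not spell out how the retrieval-augmented diagnosis step works, so the following is only a minimal sketch of the general pattern, not the authors' implementation. It assumes that each image is already represented by an embedding vector, that a small bank of labeled reference embeddings is available, and that the labels of the nearest neighbors are added to the prompt as hints. The names `cosine`, `retrieve_labels`, `build_prompt`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_labels(query, bank, k=3):
    """Return the labels of the k bank entries most similar to the query."""
    ranked = sorted(bank, key=lambda item: cosine(query, item[0]), reverse=True)
    return [label for _, label in ranked[:k]]


def build_prompt(question, retrieved):
    """Prepend the retrieved labels (with counts) to the question as hints."""
    votes = Counter(retrieved)
    hints = ", ".join(f"{label} ({n})" for label, n in votes.most_common())
    return f"Similar cases were diagnosed as: {hints}.\n{question}"


# Toy reference bank of (embedding, label) pairs; values are made up.
bank = [
    ([0.9, 0.1, 0.0], "pneumonia"),
    ([0.8, 0.2, 0.1], "pneumonia"),
    ([0.1, 0.9, 0.2], "normal"),
    ([0.0, 0.2, 0.9], "effusion"),
]
query = [0.85, 0.15, 0.05]  # hypothetical embedding of the test image

labels = retrieve_labels(query, bank, k=3)
prompt = build_prompt("What is the diagnosis for this image?", labels)
print(prompt)
```

In this sketch the augmented prompt would then be passed to the vision-language model together with the image; how MedDr actually selects and injects retrieved information may differ.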

Country:
- North America > United States (0.04)
- Europe
  - Switzerland > Zürich
    - Zürich (0.14)
  - France > Grand Est
    - Bas-Rhin > Strasbourg (0.04)
- Asia > China
  - Hong Kong (0.04)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine
  - Therapeutic Area (1.00)
  - Diagnostic Medicine > Imaging (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language
    - Large Language Model (0.69)
    - Chatbot (0.68)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
