Med-Flamingo: a Multimodal Medical Few-shot Learner
Moor, Michael, Huang, Qian, Wu, Shirley, Yasunaga, Michihiro, Zakka, Cyril, Dalmia, Yash, Reis, Eduardo Pontes, Rajpurkar, Pranav, Leskovec, Jure
Medicine, by its nature, is a multifaceted domain that requires the synthesis of information across many modalities. Medical generative vision-language models (VLMs) take a first step in this direction and promise many exciting clinical applications. However, existing models typically have to be fine-tuned on sizeable downstream datasets, which poses a significant limitation: in many medical applications data is scarce, necessitating models that can learn from few examples in real time. Here we propose Med-Flamingo, a multimodal few-shot learner adapted to the medical domain. Building on OpenFlamingo-9B, we continue pre-training on paired and interleaved medical image-text data from publications and textbooks. Med-Flamingo unlocks few-shot generative medical visual question answering (VQA) abilities, which we evaluate on several datasets, including a novel, challenging open-ended VQA dataset of visual USMLE-style problems. Furthermore, we conduct the first human evaluation for generative medical VQA, in which physicians review the problems and blinded generations in an interactive app. Med-Flamingo improves performance in generative medical VQA by up to 20% in clinicians' ratings and is the first to enable multimodal medical few-shot adaptations, such as rationale generation. We release our model, code, and evaluation app at https://github.com/snap-stanford/med-flamingo.
arXiv.org Artificial Intelligence
Jul-27-2023
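
For a concrete picture of the few-shot interface the abstract describes, the sketch below shows multimodal prompting in the OpenFlamingo style that Med-Flamingo builds on. It assumes the `create_model_and_transforms` API from the `open_flamingo` package; the language-model paths, image files, and prompt are illustrative placeholders, and the exact Med-Flamingo checkpoint loading is documented in the linked repository.

```python
# Minimal sketch of few-shot medical VQA prompting in the OpenFlamingo style.
# Paths and images below are illustrative placeholders; see
# https://github.com/snap-stanford/med-flamingo for the actual demo code.
import torch
from PIL import Image
from open_flamingo import create_model_and_transforms

model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="path/to/llama-7b",  # placeholder
    tokenizer_path="path/to/llama-7b",     # placeholder
    cross_attn_every_n_layers=4,
)
# (Load the Med-Flamingo weights into `model` here; see the repository.)

# Few-shot prompt: two demonstration image/QA pairs, then the query image.
demo_images = [Image.open(p) for p in ["xray_1.png", "xray_2.png", "query.png"]]
vision_x = torch.cat(
    [image_processor(im).unsqueeze(0) for im in demo_images], dim=0
).unsqueeze(1).unsqueeze(0)  # shape: (batch, num_images, frames, C, H, W)

tokenizer.padding_side = "left"  # generation expects left padding
prompt = (
    "<image>Question: Is there a pleural effusion? Answer: yes.<|endofchunk|>"
    "<image>Question: Is the cardiac silhouette enlarged? Answer: no.<|endofchunk|>"
    "<image>Question: Is there a pneumothorax? Answer:"
)
lang_x = tokenizer([prompt], return_tensors="pt")

generated = model.generate(
    vision_x=vision_x,
    lang_x=lang_x["input_ids"],
    attention_mask=lang_x["attention_mask"],
    max_new_tokens=20,
)
print(tokenizer.decode(generated[0]))
```

Because the model conditions on the interleaved demonstrations at inference time, no gradient updates are needed to adapt it to a new task; swapping the demonstration pairs changes the task, which is what enables the few-shot adaptations (e.g., rationale generation) mentioned in the abstract.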