Holistic Evaluation of GPT-4V for Biomedical Imaging

Liu, Zhengliang; Jiang, Hanqi; Zhong, Tianyang; Wu, Zihao; Ma, Chong; Li, Yiwei; Yu, Xiaowei; Zhang, Yutong; Pan, Yi; Shu, Peng; Lyu, Yanjun; Zhang, Lu; Yao, Junjie; Dong, Peixin; Cao, Chao; Xiao, Zhenxiang; Wang, Jiaqi; Zhao, Huan; Xu, Shaochen; Wei, Yaonai; Chen, Jingyuan; Dai, Haixing; Wang, Peilong; He, Hao; Wang, Zewei; Wang, Xinyu; Zhang, Xu; Zhao, Lin; Liu, Yiheng; Zhang, Kai; Yan, Liheng; Sun, Lichao; Liu, Jun; Qiang, Ning; Ge, Bao; Cai, Xiaoyan; Zhao, Shijie; Hu, Xintao; Yuan, Yixuan; Li, Gang; Zhang, Shu; Zhang, Xin; Jiang, Xi; Zhang, Tuo; Shen, Dinggang; Li, Quanzheng; Liu, Wei; Li, Xiang; Zhu, Dajiang; Liu, Tianming

Nov-10-2023–arXiv.org Artificial Intelligence

In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and more. Tasks include modality recognition, anatomy localization, disease diagnosis, report generation, and lesion detection. The extensive experiments provide insights into GPT-4V's strengths and weaknesses. Results show GPT-4V's proficiency in modality and anatomy recognition but difficulty with disease diagnosis and localization. GPT-4V excels at diagnostic report generation, indicating strong image captioning skills. While promising for biomedical imaging AI, GPT-4V requires further enhancement and validation before clinical deployment. We emphasize responsible development and testing for trustworthy integration of biomedical AGI. This rigorous evaluation of GPT-4V on diverse medical images advances understanding of multimodal large language models (LLMs) and guides future work toward impactful healthcare applications.

accurate assessment and recommendation, adult mouse coronal region transcent, professional neuroscientist and brain scientist, (13 more...)

Country:
- North America > United States
  - Massachusetts (0.04)
  - Texas > Tarrant County
    - Arlington (0.04)
  - North Carolina > Orange County
    - Chapel Hill (0.04)
  - California > Santa Clara County
    - Sunnyvale (0.04)
  - Arizona > Maricopa County
    - Phoenix (0.04)
- Europe
  - United Kingdom > England
    - Greater Manchester > Manchester (0.04)
  - Switzerland > Zürich
    - Zürich (0.13)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - France > Bourgogne-Franche-Comté
    - Côte-d'Or > Dijon (0.04)
- Asia > China
  - Shanghai > Shanghai (0.04)
  - Shaanxi Province > Xi'an (0.04)
  - Hong Kong (0.04)
  - Beijing > Beijing (0.04)
  - Yunnan Province > Kunming (0.04)
  - Sichuan Province > Chengdu (0.04)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Health & Medicine
  - Diagnostic Medicine > Imaging (1.00)
  - Therapeutic Area > Neurology
    - Alzheimer's Disease (0.46)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision > Image Understanding (1.00)
    - Representation & Reasoning (1.00)
    - Cognitive Science > Neuroscience (1.00)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (0.93)
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)