A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine
Xiao, Hanguang, Zhou, Feizhong, Liu, Xingyue, Liu, Tianqi, Li, Zhipeng, Liu, Xin, Huang, Xiaoxuan
–arXiv.org Artificial Intelligence
The Transformer's robust parallel computation and self-attention mechanism enable the integration of vast amounts of training data, laying the foundation for the development of LLMs and MLLMs [160]. To date, a series of Transformer-based LLMs and MLLMs have emerged (this survey focuses primarily on the vision-language modality): the PaLM series [6, 34], GPT series [16, 149], and LLaMA series [192, 193] among LLMs, and Gemini [185], GPT-4 [1], and Claude 3 [7] among MLLMs. Owing to their powerful understanding, reasoning, and generation capabilities, these models have achieved state-of-the-art results on various downstream tasks, including text generation, machine translation, and visual question answering (VQA). LLMs and MLLMs demonstrate increasingly strong generalization abilities, and their impact now extends to the medical domain, accelerating the integration of artificial intelligence and medicine [186, 188]. Notably, Google's Med-PaLM 2 [171] scored 86.5 on the United States Medical Licensing Examination (USMLE) [83], reaching the level of medical experts [267] and further showcasing the enormous potential of LLMs in medicine. In addition, medical LLMs and MLLMs such as ChatDoctor [116], LLaVA-Med [107], and XrayGLM [211] represent new avenues that artificial intelligence offers the medical field, providing potential solutions for medical report generation [201, 202, 217], clinical diagnosis [168, 195, 212], mental health services [30, 126], and a range of other clinical applications. Despite these academic breakthroughs, hospitals still face substantial challenges in training their own medical LLMs and MLLMs and deploying them in practical clinical applications.
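The self-attention mechanism referenced above can be sketched compactly. The following is a minimal, illustrative single-head scaled dot-product attention in NumPy; the shapes and random weight matrices are placeholders for exposition, not taken from any model discussed in this survey:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model) token embeddings.
    Wq, Wk, Wv: (d_model, d_k) projection matrices.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise token affinities
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each token aggregates information from all others

# toy example: 4 tokens, model dimension 8
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

Because every token attends to every other token in a single matrix product, the computation parallelizes across the whole sequence, which is what makes training on vast corpora tractable.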
First, training requires a substantial amount of medical data, which is often costly to acquire, must be annotated by medical experts, and raises data-privacy concerns [257]; each of these issues complicates model development. Second, the immense parameter counts and computational demands of LLMs and MLLMs require substantial resources for training and deployment [143, 157], significantly raising the barrier for hospitals to adopt them.
May-14-2024