BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Elliot Bolton, Abhinav Venigalla, Michihiro Yasunaga, David Hall, Betty Xiong, Tony Lee, Roxana Daneshjou, Jonathan Frankle, Percy Liang, Michael Carbin, Christopher D. Manning
arXiv.org Artificial Intelligence
Large language models such as OpenAI's GPT-4 have become the dominant technology in modern natural language processing (Liu et al., 2023; Zhao et al., 2023). Trained on large corpora to predict the next token and refined with human feedback (Brown et al., 2020; Ouyang et al., 2022; Ziegler et al., 2020), these models develop impressive capabilities in areas such as summarization and question answering (Zhang et al., 2023; Goyal et al., 2023; Karpukhin et al., 2020). While attention has centered on how these models respond to general English prompts, specialist models clearly have the potential to impact biomedical research and healthcare (Arora and Arora, 2023; Shah et al., 2023; Thirunavukarasu et al., 2023). Such applications include information retrieval and summarization over the ever-expanding biomedical literature (Wang et al., 2021; Yang, 2020) and over clinical information such as physician notes in electronic health records and radiology reports (Murray et al., 2021; Feblowitz et al., 2011; Zhang et al., 2018). Improving domain-specific language models will help accelerate biomedical discovery, drive down healthcare costs, and improve patient care. Large, general models like GPT-4 and Med-PaLM 2 have set new standards for performance on question-answering and information extraction (Kung et al., 2022; Singhal et al., 2023a,b), but these models have several drawbacks. They are costly to train and use. Compute for training and inference of large language models has increased 10- to 100-fold since 2015 (Sevilla et al., 2022), translating to extremely high financial and
Mar-27-2024
- Country:
  - North America > United States (0.68)
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Health & Medicine
    - Consumer Health (1.00)
    - Diagnostic Medicine (0.66)
    - Health Care Providers & Services (1.00)
    - Health Care Technology > Medical Record (0.86)
    - Pharmaceuticals & Biotechnology (1.00)
    - Therapeutic Area
      - Immunology (1.00)
      - Infections and Infectious Diseases (1.00)
      - Musculoskeletal (1.00)
- Technology: