Training Optimus Prime, M.D.: Generating Medical Certification Items by Fine-Tuning OpenAI's gpt2 Transformer Model
arXiv.org Artificial Intelligence
Matthias von Davier

August 21st, 2019

Abstract

Objective: To showcase artificial intelligence, in particular deep neural networks, for language modeling aimed at the automated generation of medical education test items.

Materials and Methods: OpenAI's gpt2 transformer language model was retrained on PubMed's open-access text-mining database. The retraining was done with toolkits based on tensorflow-gpu available on GitHub, on a workstation equipped with two GPUs.

Results: Compared to a study that used character-based recurrent neural networks trained on open-access items, the retrained transformer architecture generates higher-quality text that can be used as draft input for medical education assessment material. In addition, prompted text generation can be used to produce distractors suitable for the multiple-choice items used in certification exams.

Discussion: The current state of neural-network-based language models can be used to develop tools in support of authoring medical education exams, using models retrained on corpora consisting of general medical text collections.

Conclusion: Future experiments with more recent transformer models (such as Grover and TransformerXL) using existing medical certification exam item pools are expected to further improve results and facilitate the development of assessment materials.

Objective

The aim of this article is to provide evidence on the current state of automated item generation (AIG) using deep neural networks (DNNs). Based on earlier work, a first paper that tackled this issue used character-based recurrent neural networks.

Address for correspondence: mvondavier@nbme.org

Time flies in the domain of DNNs used for language modeling, indeed: the day this paper was submitted to internal review, August 13th, 2019, NVIDIA published yet another, larger language model based on the transformer architecture used in this paper. MegatronLM (apart from taking a bite out of the pun in this article's title) is currently the largest language model based on the transformer architecture [3]. This latest neural network language model has 8 billion parameters, an almost incomprehensible size compared to the type of neural networks we used only two decades ago. At that time, in the winter semester of 1999-2000, I taught classes on artificial neural networks (NNs). Back then, artificial intelligence (AI) had already entered what was referred to as the AI winter, as most network sizes were limited to rather small architectures unless supercomputers were employed.
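As a rough illustration of the two-step workflow the abstract describes (retraining gpt2 on a medical text corpus, then generating prompted text as draft item material), the following is a minimal sketch. It assumes the gpt-2-simple toolkit (github.com/minimaxir/gpt-2-simple), one of several TensorFlow-based GPT-2 wrappers on GitHub; the paper does not name the specific toolkit it used, and the corpus filename, checkpoint size, hyperparameters, and clinical prompt below are all hypothetical.

```python
# Minimal sketch, assuming the gpt-2-simple toolkit; the paper states only
# that "toolkits based on tensorflow-gpu available on GitHub" were used.
# Step 1: retrain a released gpt2 checkpoint on a plain-text corpus of
# PubMed open-access articles. Step 2: generate prompted continuations
# that can be screened as draft item text or candidate distractors.
import gpt_2_simple as gpt2

MODEL_NAME = "355M"                  # mid-size released gpt2 checkpoint (assumed)
CORPUS = "pubmed_oa_subset.txt"      # hypothetical plain-text PubMed OA dump

gpt2.download_gpt2(model_name=MODEL_NAME)   # fetch the pretrained weights

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset=CORPUS,
              model_name=MODEL_NAME,
              steps=10000,           # training budget: an assumption
              sample_every=1000,     # print sample generations periodically
              save_every=1000)       # checkpoints go to ./checkpoint/run1

# Prompted generation: seed the model with an item stem so the sampled
# continuations can be reviewed as candidate distractors.
gpt2.generate(sess,
              prefix="A 54-year-old man presents with crushing chest pain",
              length=200,
              temperature=0.7,
              nsamples=5)
```

Note that gpt-2-simple targets a single GPU per session; reproducing the paper's dual-GPU workstation setup would likely require one of the lower-level GitHub GPT-2 training forks with manual device placement.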