GatorTronGPT
A Study of Large Language Models for Patient Information Extraction: Model Architecture, Fine-Tuning Strategy, and Multi-task Instruction Tuning
Peng, Cheng, Dong, Xinyu, Lyu, Mengxian, Paredes, Daniel, Zhang, Yaoyun, Wu, Yonghui
Keywords: clinical information extraction; large language model; clinical concept extraction; clinical relation extraction; instruction tuning
ABSTRACT. Background: Natural language processing (NLP) is a key technology to extract important patient information from clinical narratives to support healthcare applications. The rapid development of large language models (LLMs) has revolutionized many NLP tasks in the clinical domain, yet their optimal use for patient information extraction requires further exploration. This study examines the effectiveness of LLMs for patient information extraction, focusing on LLM architectures, fine-tuning strategies, and multi-task instruction tuning techniques for developing robust and generalizable patient information extraction systems. Methods: This study explores key aspects of using LLMs for clinical concept and relation extraction tasks, including: (1) encoder-only versus decoder-only LLM architectures, (2) prompt-based parameter-efficient fine-tuning (PEFT) algorithms, and (3) the effect of multi-task instruction tuning on few-shot learning performance. We benchmarked a suite of LLMs, including encoder-based LLMs (BERT, GatorTron) and decoder-based LLMs (GatorTronGPT, Llama 3.1, GatorTronLlama), across five datasets. We compared traditional full-size fine-tuning and prompt-based PEFT. We also explored a multi-task instruction tuning framework that combines both tasks across four datasets to evaluate zero-shot and few-shot learning performance using a leave-one-dataset-out strategy. Results: For single-task clinical concept extraction, the two decoder-based LLMs (Llama 3.1 and GatorTronLlama) achieved the best performance, with average F1 scores of 0.8964 and 0.8981, respectively, across the five datasets, outperforming the other LLMs by 0.7~3.3% in average F1. Encoder-based LLMs with prompt-based learning outperformed those implemented using classification.
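The leave-one-dataset-out protocol described in this abstract can be sketched as a simple split-generation loop; the dataset names below are hypothetical placeholders, not the study's actual corpora:

```python
# Leave-one-dataset-out evaluation: instruction-tune on all datasets
# except one, then measure zero-/few-shot performance on the held-out set.
DATASETS = ["ds_a", "ds_b", "ds_c", "ds_d"]  # hypothetical dataset names

def leave_one_dataset_out(datasets):
    """Yield (train_sets, held_out) splits, one fold per dataset."""
    for held_out in datasets:
        train_sets = [d for d in datasets if d != held_out]
        yield train_sets, held_out

splits = list(leave_one_dataset_out(DATASETS))
# With four datasets, each fold trains on three and evaluates on the rest.
for train_sets, held_out in splits:
    assert held_out not in train_sets and len(train_sets) == 3
```

Each held-out fold measures how well the multi-task instruction-tuned model generalizes to a dataset it never saw during tuning.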
UF-HOBI at "Discharge Me!": A Hybrid Solution for Discharge Summary Generation Through Prompt-based Tuning of GatorTronGPT Models
Lyu, Mengxian, Peng, Cheng, Paredes, Daniel, Chen, Ziyi, Chen, Aokun, Bian, Jiang, Wu, Yonghui
Automatic generation of discharge summaries presents significant challenges due to the length of clinical documentation, the dispersed nature of patient information, and the diverse terminology used in healthcare. This paper presents a hybrid solution for generating discharge summary sections as part of our participation in the "Discharge Me!" Challenge at the BioNLP 2024 Shared Task. We developed a two-stage generation method combining extractive and abstractive techniques: we first apply named entity recognition (NER) to extract key clinical concepts, which are then used as input to a prompt-tuning-based GatorTronGPT model to generate coherent text for two key sections, "Brief Hospital Course" and "Discharge Instructions". Our system ranked 5th in this challenge, achieving an overall score of 0.284. The results demonstrate the effectiveness of our hybrid solution in improving the quality of automated discharge section generation.
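The two-stage extract-then-generate pipeline described above can be sketched with stub functions; `extract_concepts` and `generate_section` below are hypothetical stand-ins for the NER model and the prompt-tuned GatorTronGPT model, not the authors' implementation:

```python
# Stage 1: extractive NER pulls key clinical concepts from the note.
# Stage 2: abstractive generation drafts a section conditioned on them.

def extract_concepts(note: str) -> list[str]:
    """Stub for the NER model: match words against a tiny concept vocabulary."""
    vocab = {"pneumonia", "ceftriaxone", "fever"}
    return [w.strip(".,") for w in note.lower().split() if w.strip(".,") in vocab]

def generate_section(section: str, concepts: list[str]) -> str:
    """Stub for the generative model: draft a section from the concepts."""
    return f"{section}: patient course involving {', '.join(concepts)}."

note = "Admitted with fever and pneumonia. Treated with ceftriaxone."
concepts = extract_concepts(note)
summary = generate_section("Brief Hospital Course", concepts)
```

Grounding the generator on explicitly extracted concepts is what makes the solution "hybrid": the abstractive stage does not have to rediscover key facts buried in long documentation.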
Automatic Summarization of Doctor-Patient Encounter Dialogues Using Large Language Model through Prompt Tuning
Lyu, Mengxian, Peng, Cheng, Li, Xiaohan, Balian, Patrick, Bian, Jiang, Wu, Yonghui
Automatic text summarization (ATS) is an emerging technology to assist clinicians in providing continuous and coordinated care. This study presents an approach to summarizing doctor-patient dialogues using generative large language models (LLMs). We developed prompt-tuning algorithms to instruct generative LLMs to summarize clinical text. We examined the prompt-tuning strategies, the size of soft prompts, and the few-shot learning ability of GatorTronGPT, a generative clinical LLM developed using 277 billion clinical and general English words with up to 20 billion parameters. We compared GatorTronGPT with a previous solution based on fine-tuning of the widely used T5 model, using the clinical benchmark dataset MTS-DIALOG. The experimental results show that the GatorTronGPT-20B model achieved the best performance on all evaluation metrics. The proposed solution has a low computing cost because the LLM parameters are not updated during prompt tuning. This study demonstrates the efficiency of generative clinical LLMs for clinical ATS through prompt tuning.
Generative Large Language Models Are All-purpose Text Analytics Engines: Text-to-text Learning Is All Your Need
Peng, Cheng, Yang, Xi, Chen, Aokun, Yu, Zehao, Smith, Kaleb E, Costa, Anthony B, Flores, Mona G, Bian, Jiang, Wu, Yonghui
Objective: To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning. Methods: We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using the GPT-3 architecture and trained with up to 20 billion parameters. We adopted soft prompts (i.e., trainable vectors) with a frozen LLM, where the LLM parameters were not updated (i.e., frozen) and only the vectors of the soft prompts were updated, known as prompt tuning. We added the soft prompts as a prefix to the input layer and optimized them during prompt tuning. We evaluated the proposed method on the 7 clinical NLP tasks and compared it with previous task-specific solutions based on Transformer models. Results and Conclusion: The proposed approach achieved state-of-the-art performance for 5 out of 7 major clinical NLP tasks using one unified generative LLM. Our approach outperformed previous task-specific transformer models by ~3% for concept extraction and 7% for relation extraction applied to social determinants of health, 3.4% for clinical concept normalization, 3.4~10% for clinical abbreviation disambiguation, and 5.5~9% for natural language inference. Our approach also outperformed a previously developed prompt-based machine reading comprehension (MRC) model, GatorTron-MRC, for clinical concept and relation extraction. The proposed approach can deliver the "one model for all" promise from training to deployment using a unified generative LLM.
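The prompt-tuning mechanism described in this abstract (trainable soft-prompt vectors prepended as a prefix to the input embeddings of a frozen LLM) can be sketched with NumPy; the dimensions and variable names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8    # embedding dimension (illustrative)
n_prompt = 4   # number of soft-prompt vectors
seq_len = 6    # number of input tokens

# Trainable soft prompts: the only parameters updated during prompt tuning.
soft_prompts = rng.normal(size=(n_prompt, d_model))

# Token embeddings from the frozen LLM's embedding table (not updated).
token_embeds = rng.normal(size=(seq_len, d_model))

# Prompt tuning prepends the soft prompts as a prefix at the input layer;
# during training, gradients flow only into `soft_prompts`.
model_input = np.concatenate([soft_prompts, token_embeds], axis=0)
assert model_input.shape == (n_prompt + seq_len, d_model)
```

Because only the small soft-prompt matrix is trained while the billions of LLM parameters stay frozen, the computing cost is low and one deployed model can serve many tasks by swapping prompt prefixes.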
A Study of Generative Large Language Model for Medical Research and Healthcare
Peng, Cheng, Yang, Xi, Chen, Aokun, Smith, Kaleb E, PourNejatian, Nima, Costa, Anthony B, Martin, Cheryl, Flores, Mona G, Zhang, Ying, Magoc, Tanja, Lipori, Gloria, Mitchell, Duane A, Ospina, Naykky S, Ahmed, Mustafa M, Hogan, William R, Shenkman, Elizabeth A, Guo, Yi, Bian, Jiang, Wu, Yonghui
There are both enormous enthusiasm and concerns about using large language models (LLMs) in healthcare, yet current assessments are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. NLP models trained on synthetic text generated by GatorTronGPT outperform NLP models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT versus 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT versus 6.97 for human), and that physicians cannot differentiate the two (p < 0.001). This study provides insights on the opportunities and challenges of LLMs for medical research and healthcare.