Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models

Xu, Ran, Cui, Hejie, Yu, Yue, Kan, Xuan, Shi, Wenqi, Zhuang, Yuchen, Jin, Wei, Ho, Joyce, Yang, Carl

arXiv.org Artificial Intelligence 

Clinical natural language processing requires methods that can address domain-specific challenges, such as complex medical terminology and clinical contexts. Recently, large language models (LLMs) have shown promise in this domain. Yet their direct deployment can raise privacy issues and is constrained by resources. To address this challenge, we delve into synthetic clinical text generation using LLMs for clinical NLP tasks. Our model involves clinical knowledge extraction and context-informed LLM prompting. Both clinical topics and writing styles are drawn from external domain-specific knowledge graphs and LLMs to guide data generation.

Clinical Natural Language Processing (NLP) has emerged as a distinct subfield covering the extraction, analysis, and interpretation of medical data from unstructured clinical text (Wornow et al., 2023). Despite its significance, methodology development in clinical NLP faces unique challenges. For example, clinical texts are often dense with abbreviations and specialized medical terminology that can confound standard NLP models (Cui et al., 2022; Lee et al., 2023). Progress in LLMs motivates specialized approaches for adapting them to clinical settings, approaches that both address terminology complexity and improve models through fine-tuning on clinical data (Tu et al., 2023; Liu et al., 2023a). Despite the strong capacity of general LLMs, applying them directly for inference over clinical text is often undesirable in practice. First, these LLMs often have billions of parameters, which demand significant computational resources even for inference and lead to higher infrastructure costs and long inference times. Second, the sensitive patient information contained in clinical text naturally raises privacy and regulatory compliance concerns (Meskó & Topol, 2023; Keeling, 2023).
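The context-informed prompting idea described above, in which a clinical topic and a writing style guide each generation request, might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `build_prompt`, the prompt wording, and the example topics and styles are all hypothetical placeholders, and the call to an actual LLM is left out.

```python
import random


def build_prompt(task, label, topics, styles, rng):
    """Compose one generation prompt from a sampled topic and writing style.

    Hypothetical helper: in the setting described in the text, `topics`
    and `styles` would come from a knowledge graph or an LLM; here they
    are plain lists supplied by the caller.
    """
    topic = rng.choice(topics)  # sample one clinical topic
    style = rng.choice(styles)  # sample one writing style
    return (
        f"Generate a synthetic {task} example for the class '{label}'. "
        f"The text should discuss the topic '{topic}' "
        f"and be written in the style of {style}."
    )


# Placeholder inputs, assumed for illustration only.
topics = ["hypertension", "type 2 diabetes", "asthma"]
styles = ["a discharge summary", "a clinical trial abstract"]

rng = random.Random(0)  # fixed seed so repeated runs sample the same pair
prompt = build_prompt("sentence classification", "disease mention",
                      topics, styles, rng)
print(prompt)
```

Sampling a different topic and style for each request is one simple way to vary the generated text; how the topics and styles are actually extracted is outside the scope of this sketch.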
