Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data

Shan Chen, Jack Gallifant, Marco Guevara, Yanjun Gao, Majid Afshar, Timothy Miller, Dmitriy Dligach, Danielle S. Bitterman

Mar-28-2024–arXiv.org Artificial Intelligence

A common challenge in developing clinical natural language processing (NLP) methods is the limited availability of large annotated datasets for model training, fine-tuning, and evaluation. Traditional annotation processes are time-consuming and expensive, and they often require expert medical knowledge, which significantly constrains research and benchmark development.

dataset, synthetic data, synthetic data generation

Country:
- North America > United States
  - Wisconsin > Dane County
    - Madison (0.04)
  - Massachusetts
    - Suffolk County > Boston (0.06)
    - Middlesex County > Cambridge (0.04)
  - Illinois > Cook County
    - Chicago (0.05)
- Europe > United Kingdom
  - England > Greater London > London (0.04)
- Asia
  - Singapore (0.04)
  - Indonesia > Bali (0.04)

Genre:
- Research Report > Experimental Study (0.42)

Industry:
- Health & Medicine
  - Therapeutic Area > Oncology (1.00)
  - Diagnostic Medicine (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.50)