Weakly Supervised Text Classification on Free Text Comments in Patient-Reported Outcome Measures
Linton, Anna-Grace; Dimitrova, Vania; Downing, Amy; Wagland, Richard; Glaser, Adam
arXiv.org Artificial Intelligence
Free text comments (FTC) in patient-reported outcome measures (PROMs) data are typically analysed using manual methods, such as content analysis, which are labour-intensive and time-consuming. Existing machine learning methods for analysing FTC are largely unsupervised, so their output requires post-analysis interpretation. Weakly supervised text classification (WSTC) can be a valuable method for classifying domain-specific text when little labelled data is available. In this paper, we apply five WSTC techniques to FTC in PROMs data to identify health-related quality of life (HRQoL) themes reported by colorectal cancer patients. The WSTC methods label all the themes mentioned in each FTC. The results showed moderate performance on the PROMs data, attributable mainly to the precision of the models, with variation between themes. Evaluation of the classification performance illustrated the potential and limitations of keyword-based WSTC for labelling PROMs FTC when labelled data is limited.
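As a rough illustration of what keyword-based weak labelling of comments can look like, the sketch below assigns every theme whose seed keywords appear in a comment, so a single comment can receive several labels. It is a minimal sketch under stated assumptions, not one of the five techniques evaluated in the paper: the theme names, the seed keywords, and the helper functions `label_comment` and `precision` are all hypothetical.

```python
import re

# Hypothetical seed keywords per HRQoL theme (illustrative only, not taken from the paper).
SEED_KEYWORDS = {
    "fatigue": {"tired", "fatigue", "exhausted"},
    "bowel_function": {"bowel", "stoma", "diarrhoea"},
    "emotional_wellbeing": {"anxious", "worried", "depressed"},
}


def label_comment(comment, seeds=SEED_KEYWORDS):
    """Return the set of themes whose seed keywords occur in the comment."""
    # Lowercase and split into alphabetic tokens so matching ignores case and punctuation.
    tokens = set(re.findall(r"[a-z]+", comment.lower()))
    # A theme is assigned when at least one of its seed keywords is present.
    return {theme for theme, words in seeds.items() if tokens & words}


def precision(predicted, gold):
    """Fraction of predicted themes that also appear in the gold annotation."""
    if not predicted:
        return 0.0
    return len(predicted & gold) / len(predicted)


labels = label_comment("I feel tired and worried about my stoma.")
```

A comment that mentions no seed keyword receives no label, and a keyword used in an unrelated sense still triggers its theme, which is one way precision can suffer in an approach of this kind.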
Aug-11-2023
- Country:
  - North America
    - Dominican Republic (0.04)
    - United States
      - New York > New York County
        - New York City (0.04)
      - Florida > Orange County
        - Orlando (0.04)
  - Europe
    - United Kingdom
      - Wales (0.04)
      - England
        - West Yorkshire > Leeds (0.04)
        - Hampshire > Southampton (0.04)
        - Greater Manchester > Manchester (0.04)
    - Italy > Piedmont
      - Turin Province > Turin (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
  - Asia > Japan
    - Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
- Genre:
  - Research Report > New Finding (0.68)
- Industry:
  - Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.35)
- Technology: