Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health

Liyanage, Chandreen, Garg, Muskan, Mago, Vijay, Sohn, Sunghwan

Jun-6-2023–arXiv.org Artificial Intelligence

Amid the ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative NLP models for data augmentation to enable further improvement in the pre-screening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach based on prompt-based generative NLP models, and evaluate the ROUGE scores and syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all the other methods and improves over baselines such as Easy Data Augmentation and backtranslation. Introducing data augmentation to generate more training samples and a balanced dataset improves the F-score and the Matthews Correlation Coefficient by up to 13.11% and 15.95%, respectively.
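For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how the F-score and the Matthews Correlation Coefficient can be computed from confusion-matrix counts. It assumes the binary case for simplicity, and the function names and counts are hypothetical illustrations; it is not the authors' evaluation code, and the numbers are not taken from the paper.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts for a rare class in an imbalanced test set.
    counts = {"tp": 5, "tn": 90, "fp": 0, "fn": 5}
    print("F1 :", round(f1_score(counts["tp"], counts["fp"], counts["fn"]), 4))
    print("MCC:", round(mcc(**counts), 4))
```

Unlike accuracy, both metrics penalize a classifier that ignores the minority class, which is why they are natural choices when the label distribution is imbalanced.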

large language model, machine learning, natural language

Country:
- North America
  - United States > Minnesota
    - Olmsted County > Rochester (0.04)
  - Canada > Ontario
    - Thunder Bay (0.04)

Genre:
- Research Report (0.50)

Industry:
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.72)
