Detecting PTSD in Clinical Interviews: A Comparative Analysis of NLP Methods and Large Language Models

Feng Chen, Dror Ben-Zeev, Gillian Sparks, Arya Kadakia, Trevor Cohen

arXiv.org Artificial Intelligence 

Post-Traumatic Stress Disorder (PTSD) remains underdiagnosed in clinical settings, presenting opportunities for automated detection to identify patients. This study evaluates natural language processing approaches for detecting PTSD from clinical interview transcripts. We compared general and mental health-specific transformer models (BERT/RoBERTa), embedding-based methods (SentenceBERT/LLaMA), and large language model prompting strategies (zero-shot/few-shot/chain-of-thought) using the DAIC-WOZ dataset. Domain-specific models significantly outperformed general models (Mental-RoBERTa F1=0.643 vs. RoBERTa-base 0.485). LLaMA embeddings with neural networks achieved the highest performance (F1=0.700). Zero-shot prompting using DSM-5 criteria yielded competitive results without training data (F1=0.657). Performance varied significantly across symptom severity and comorbidity status, with higher accuracy for severe PTSD cases and patients with comorbid depression. Our findings highlight the potential of domain-adapted embeddings and LLMs for scalable screening, while underscoring the need for improved detection of nuanced presentations, and offer insights for developing clinically viable AI tools for PTSD assessment.

Introduction

Post-Traumatic Stress Disorder (PTSD) affects approximately 6% of the U.S. population, with significantly higher rates among veterans and trauma survivors. Despite its prevalence, PTSD remains underdiagnosed in primary care settings, with studies suggesting that around 30% of cases go unrecognized.
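The best-performing configuration reported in the abstract pairs transcript embeddings with a neural network classifier. A minimal sketch of that general pipeline is shown below; it is not the authors' implementation. Real SentenceBERT or LLaMA embeddings are replaced with synthetic random vectors carrying a weak class signal, and the embedding dimension, network size, and labels are all illustrative assumptions.

```python
# Illustrative embedding-plus-classifier pipeline (assumed setup, not the
# paper's actual code). Random vectors stand in for transcript embeddings.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# Stand-in for 200 interview transcripts embedded into 384-dim vectors.
X = rng.normal(size=(200, 384))
y = rng.integers(0, 2, size=200)          # synthetic PTSD / non-PTSD labels
X[y == 1, :10] += 1.0                     # inject a weak, learnable signal

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Small feedforward network over the fixed embeddings.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

f1 = f1_score(y_te, clf.predict(X_te))
print(f"F1 on held-out split: {f1:.3f}")
```

In the study itself the labels come from the DAIC-WOZ annotations and the embeddings from pretrained models; only the classifier-over-embeddings structure is what this sketch illustrates.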