Linguistic trajectories of bipolar disorder on social media
Plank, Laurin; Zlomuzica, Armin
arXiv.org Artificial Intelligence
Correspondence should be addressed to: Laurin Plank. This paper has not yet been peer-reviewed.

Abstract

Language provides valuable markers of affective disorders such as bipolar disorder (BD), yet clinical assessments remain limited in scale. In response, analyses of social media (SM) language have gained prominence due to their high temporal resolution and longitudinal scope. Here, we introduce a method to determine the timing of users' diagnoses and apply it to study language trajectories from 3 years before to 21 years after BD diagnosis, contrasted with users reporting unipolar depression (UD) and non-affected users (HC). We show that BD diagnosis is accompanied by pervasive linguistic alterations reflecting mood disturbance, psychiatric comorbidity, substance abuse, hospitalization, medical comorbidities, unusual thought content, and disorganized thought. We further observe recurring mood-related language changes across two decades after the diagnosis, with a pronounced 12-month periodicity suggestive of seasonal mood episodes. Finally, trend-level evidence suggests an increased periodicity in users estimated to be female. In sum, our findings provide evidence for language alterations in the acute and chronic phase of BD. This validates and extends recent efforts leveraging SM for scalable monitoring of mental health.

Knowledge of diagnosis events allows language alterations to be contextualized with respect to the current disorder phase. For example, it would allow comparing language change from a premorbid to the acute disorder phase, or studying long-term behavioral patterns in the chronic disorder phase. We then use the resulting digital clinical cohorts (DICCs) to study longitudinal language trajectories in users who self-disclose having been diagnosed with BD.

This time information is then passed to SUTime, a temporal parsing algorithm, which yields normalized datetime information. These data are additionally filtered through a rule-based algorithm to exclude non-viable datetimes (e.g., those including seasonal information such as "spring, 2022"). Pseudo-diagnoses are assigned to a group of regular Reddit users who serve as a healthy control group (HC). Fig. 1 gives an overview of the DICCs pipeline.
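The rule-based filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `parse_viable` and `filter_viable` are hypothetical, and the rule set is an assumption. The sketch assumes that normalized values arrive as TIMEX3-style strings (for example "2022-03-15", "2022-03", or a season code such as "2022-SP") and treats only values with at least month resolution as viable.

```python
import re
from datetime import date

# Assumption: normalized values are TIMEX3-style strings. Only "YYYY-MM" and
# "YYYY-MM-DD" count as viable here; season codes (e.g. "2022-SP"), week
# codes, and year-only values are rejected. This rule set is illustrative.
_VIABLE = re.compile(r"^(\d{4})-(\d{2})(?:-(\d{2}))?$")


def parse_viable(value):
    """Return a date for a viable normalized value, or None otherwise."""
    match = _VIABLE.match(value)
    if match is None:
        return None
    year, month, day = match.groups()
    try:
        # A missing day defaults to the first of the month.
        return date(int(year), int(month), int(day or 1))
    except ValueError:
        # Pattern matched, but the calendar date does not exist (e.g. month 13).
        return None


def filter_viable(values):
    """Keep only viable values, mapped to their parsed dates."""
    parsed = ((value, parse_viable(value)) for value in values)
    return {value: d for value, d in parsed if d is not None}


if __name__ == "__main__":
    candidates = ["2022-03-15", "2022-03", "2022-SP", "2022", "2022-13"]
    print(filter_viable(candidates))
```

Whether year-only values should be kept depends on the temporal resolution the downstream analysis needs; the sketch rejects them only as one possible design choice.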
Sep-15-2025
- Country:
- Europe
- Germany (0.04)
- Netherlands > South Holland
- Dordrecht (0.04)
- North America > United States (0.14)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)