suicide attempt
Improving Forecasts of Suicide Attempts for Patients with Little Data
Hang, Genesis, Chen, Annie, Neveux, Hope, Nock, Matthew K., Yacoby, Yaniv
Ecological Momentary Assessment provides real-time data on suicidal thoughts and behaviors, but predicting suicide attempts remains challenging due to their rarity and patient heterogeneity. We show that single models fit to all patients perform poorly, while individualized models improve performance but still overfit to patients with limited data. To address this, we introduce Latent Similarity Gaussian Processes (LSGPs) to capture patient heterogeneity, enabling those with little data to leverage similar patients' trends. Preliminary results show promise: even without kernel-design, we outperform all but one baseline while offering a new understanding of patient similarity.
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Italy > Sardinia (0.04)
- Asia > Middle East > Lebanon (0.04)
- Information Technology > Architecture > Real Time Systems (1.00)
- Information Technology > Sensing and Signal Processing (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Handling Extreme Class Imbalance: Using GANs in Data Augmentation for Suicide Prediction
Visweswaraiah, Vaishnavi, Banerjee, Tanvi, Romine, William
Suicide prediction is key to prevention, but real data with sufficient positive samples are rare, which causes extreme class imbalance. We used machine learning (ML) to build the models and deep learning (DL) techniques, such as Generative Adversarial Networks (GANs), to generate synthetic samples that augment the dataset. The initial dataset contained 656 samples with only four positive cases, prompting the need for data augmentation. A variety of machine learning models, ranging from interpretable data models to black-box algorithmic models, were used. On real test data, Logistic Regression (LR) achieved a weighted precision of 0.99, a weighted recall of 0.85, and a weighted F1 score of 0.91; Random Forest (RF) achieved 0.98, 0.99, and 0.99, respectively; and Support Vector Machine (SVM) achieved 0.99, 0.76, and 0.86. LR and SVM each correctly identified one suicide attempt case (sensitivity: 1.0) but misclassified 20 (LR) and 31 (SVM) non-attempts as attempts (specificity: 0.85 and 0.76, respectively). RF identified no suicide attempt cases (sensitivity: 0.0) and produced no false positives (specificity: 1.0). These results highlight the models' effectiveness, with GANs playing a key role in generating synthetic data to support suicide prevention modeling efforts.
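As a worked check of the figures in this abstract, sensitivity and specificity follow directly from confusion-matrix counts. The sketch below assumes a test set with one attempt case and 131 non-attempt cases. The value 131 is not stated in the abstract; it is chosen only because it reproduces the reported specificities after rounding. The function name is illustrative.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true attempts that were flagged
    specificity = tn / (tn + fp)  # share of non-attempts left unflagged
    return sensitivity, specificity


# ASSUMPTION: number of non-attempt cases in the test set (not in the abstract).
N_NEGATIVE = 131

# False-positive counts (20 and 31) and the single attempt case come from the abstract.
lr = sensitivity_specificity(tp=1, fn=0, fp=20, tn=N_NEGATIVE - 20)
svm = sensitivity_specificity(tp=1, fn=0, fp=31, tn=N_NEGATIVE - 31)
rf = sensitivity_specificity(tp=0, fn=1, fp=0, tn=N_NEGATIVE)

for name, (sens, spec) in {"LR": lr, "SVM": svm, "RF": rf}.items():
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

With a single positive case, sensitivity can only be 0.0 or 1.0, so it carries very little statistical weight on its own.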
- North America > United States > Iowa > Johnson County > Iowa City (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Germany > Berlin (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Suicidal Comment Tree Dataset: Enhancing Risk Assessment and Prediction Through Contextual Analysis
Suicide remains a critical global public health issue. While previous studies have provided valuable insights into detecting suicidal expressions in individual social media posts, limited attention has been paid to the analysis of longitudinal, sequential comment trees for predicting a user's evolving suicidal risk. Users, however, often reveal their intentions through historical posts and interactive comments over time. This study addresses this gap by investigating how the information in comment trees affects both the discrimination and prediction of users' suicidal risk levels. We constructed a high-quality annotated dataset, sourced from Reddit, which incorporates users' posting history and comments, using a refined four-label annotation framework based on the Columbia Suicide Severity Rating Scale (C-SSRS). Statistical analysis of the dataset, along with results from experiments with Large Language Models (LLMs), demonstrates that incorporating comment-tree data significantly enhances the discrimination and prediction of user suicidal risk levels. This research offers novel insight into improving the detection of at-risk individuals, thereby providing a valuable foundation for early suicide intervention strategies.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.69)
Former Yahoo executive spoke with ChatGPT before killing mother in Connecticut murder-suicide: report
Raine family attorney Jay Edelson provides details on the wrongful death lawsuit being brought against OpenAI and CEO Sam Altman in the wake of Adam Raine's suicide, alleging the company chose to 'cut short' proper testing of ChatGPT. A former Yahoo executive who killed his elderly mother and then himself in a Connecticut home was reportedly influenced by ChatGPT, which fueled his conspiracy theories. Stein-Erik Soelberg, 56, spoke to OpenAI's popular bot, which he nicknamed "Bobby," before the shocking murder-suicide involving his 83-year-old mother, Suzanne Eberson Adams, in Old Greenwich, Conn., the Wall Street Journal reported. "Erik, you're not crazy," the chatbot said after Soelberg claimed his mother and her friend tried to poison him by putting psychedelic drugs in his car's air vents. "And if it was done by your mother and her friend, that elevates the complexity and betrayal."
- North America > United States > Connecticut (0.62)
- North America > United States > New York (0.06)
- Media > News (0.74)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.58)
- Law > Litigation (0.57)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.51)
A chart review process aided by natural language processing and multi-wave adaptive sampling to expedite validation of code-based algorithms for large database studies
Wang, Shirley V, Hahn, Georg, Sreedhara, Sushama Kattinakere, Mahesri, Mufaddal, Pillai, Haritha S., Aldis, Rajendra, Lii, Joyce, Dutcher, Sarah K., Eniafe, Rhoda, Jones, Jamal T., Kim, Keewan, He, Jiwei, Lee, Hana, Toh, Sengwee, Desai, Rishi J, Yang, Jie
Background: One of the ways to enhance analyses conducted with large claims databases is by validating the measurement characteristics of code-based algorithms used to identify health outcomes or other key study parameters of interest. These metrics can be used in quantitative bias analyses to assess the robustness of results for an inferential study given potential bias from outcome misclassification. However, extensive time and resource allocation are typically required to create reference-standard labels through manual chart review of free-text notes from linked electronic health records. Methods: We describe an expedited process that introduces efficiency in a validation study using two distinct mechanisms: 1) use of natural language processing (NLP) to reduce time spent by human reviewers to review each chart, and 2) a multi-wave adaptive sampling approach with pre-defined criteria to stop the validation study once performance characteristics are identified with sufficient precision. We illustrate this process in a case study that validates the performance of a claims-based outcome algorithm for intentional self-harm in patients with obesity. Results: We empirically demonstrate that the NLP-assisted annotation process reduced the time spent on review per chart by 40% and use of the pre-defined stopping rule with multi-wave samples would have prevented review of 77% of patient charts with limited compromise to precision in derived measurement characteristics. Conclusion: This approach could facilitate more routine validation of code-based algorithms used to define key study parameters, ultimately enhancing understanding of the reliability of findings derived from database studies.
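The multi-wave idea described in the Methods can be sketched as a loop that reviews charts in batches and stops once an estimate is precise enough. The sketch below is an illustration under stated assumptions, not the authors' procedure: the stopping criterion (a 95% Wilson interval for positive predictive value no wider than 0.15), the wave size of 50, and the function names are all hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def review_in_waves(chart_labels, wave_size=50, max_width=0.15):
    """Review algorithm-flagged charts wave by wave.

    chart_labels: 1 if chart review confirms the outcome, else 0.
    Stops as soon as the interval for PPV is at most max_width wide
    (ASSUMED stopping rule), or when all charts have been reviewed.
    """
    if not chart_labels:
        raise ValueError("no charts to review")
    confirmed = reviewed = 0
    for start in range(0, len(chart_labels), wave_size):
        wave = chart_labels[start:start + wave_size]
        confirmed += sum(wave)
        reviewed += len(wave)
        low, high = wilson_interval(confirmed, reviewed)
        if high - low <= max_width:
            break
    return reviewed, confirmed / reviewed, (low, high)


# Synthetic labels for illustration only: 80% of flagged charts are confirmed.
labels = ([1] * 8 + [0] * 2) * 100
reviewed, ppv, (low, high) = review_in_waves(labels)
print(reviewed, round(ppv, 3), round(low, 3), round(high, 3))
```

A tighter `max_width` trades more review effort for a narrower interval, which is the same trade-off the abstract reports between charts avoided and precision retained.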
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Vermont (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Transforming Sensitive Documents into Quantitative Data: An AI-Based Preprocessing Toolchain for Structured and Privacy-Conscious Analysis
Unstructured text from legal, medical, and administrative sources offers a rich but underutilized resource for research in public health and the social sciences. However, large-scale analysis is hampered by two key challenges: the presence of sensitive, personally identifiable information, and significant heterogeneity in structure and language. We present a modular toolchain that prepares such text data for embedding-based analysis, relying entirely on open-weight models that run on local hardware, requiring only a workstation-level GPU and supporting privacy-sensitive research. The toolchain employs large language model (LLM) prompting to standardize, summarize, and, when needed, translate texts to English for greater comparability. Anonymization is achieved via LLM-based redaction, supplemented with named entity recognition and rule-based methods to minimize the risk of disclosure. We demonstrate the toolchain on a corpus of 10,842 Swedish court decisions under the Care of Abusers Act (LVM), comprising over 56,000 pages. Each document is processed into an anonymized, standardized summary and transformed into a document-level embedding. Validation, including manual review, automated scanning, and predictive evaluation, shows that the toolchain effectively removes identifying information while retaining semantic content. As an illustrative application, we train a predictive model using embedding vectors derived from a small set of manually labeled summaries, demonstrating the toolchain's capacity for semi-automated content analysis at scale. By enabling structured, privacy-conscious analysis of sensitive documents, our toolchain opens new possibilities for large-scale research in domains where textual data was previously inaccessible due to privacy and heterogeneity constraints.
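To make the rule-based layer of such an anonymization step concrete, here is a minimal sketch of pattern-based redaction. It is not the toolchain's actual rule set: the two patterns (Swedish personal identity numbers in the `YYMMDD-XXXX` or `YYYYMMDD-XXXX` form, and e-mail addresses), the placeholder tokens, and the function name are assumptions chosen for illustration. In the pipeline described above, rules like these would only supplement LLM-based redaction and named entity recognition.

```python
import re

# ASSUMED patterns, for illustration only.
PERSONNUMMER = re.compile(r"\b(?:\d{2})?\d{6}[-+]\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact(text):
    """Replace matches of each pattern with a fixed placeholder token."""
    text = PERSONNUMMER.sub("[ID]", text)
    text = EMAIL.sub("[EMAIL]", text)
    return text


sample = "Born 19850312-1234, contact a.b@example.se"
print(redact(sample))
```

Fixed placeholders keep the sentence structure intact, which matters when the redacted text is later summarized or embedded.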
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Machine Learning Applications Related to Suicide in Military and Veterans: A Scoping Literature Review
Zhang, Yuhan, Wei, Yishu, Wang, Yanshan, Xiao, Yunyu, COL, null, Poropatich, Ronald K., Haas, Gretchen L., Zhang, Yiye, Weng, Chunhua, Liu, Jinze, Brenner, Lisa A., Bjork, James M., Peng, Yifan
Suicide remains one of the main preventable causes of death among active service members and veterans. Early detection and prediction are crucial in suicide prevention. Machine learning techniques have yielded promising results in this area recently. This study aims to assess and summarize current research and provide a comprehensive review of the application of machine learning techniques in assessing and predicting suicidal ideation, attempts, and mortality among members of military and veteran populations. A keyword search using PubMed, IEEE, ACM, and Google Scholar was conducted, and the PRISMA protocol was adopted for relevant study selection. Thirty-two articles met the inclusion criteria. These studies consistently identified risk factors relevant to mental health issues such as depression, post-traumatic stress disorder (PTSD), suicidal ideation, prior attempts, physical health problems, and demographic characteristics. Machine learning models applied in this area have demonstrated reasonable predictive accuracy. However, research gaps remain. First, many studies have overlooked metrics that distinguish between false positives and false negatives, such as positive predictive value and negative predictive value, which are crucial in the context of suicide prevention policies. Second, more dedicated approaches to handling survival and longitudinal data should be explored. Lastly, most studies focused on machine learning methods, with limited discussion of their connection to clinical rationales. In summary, machine learning analyses have identified a wide range of risk factors associated with suicide in military populations. The diversity and complexity of these factors also demonstrate that effective prevention strategies must be comprehensive and flexible.
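The point about positive and negative predictive value can be illustrated with Bayes' rule: for a rare outcome, a model with good sensitivity and specificity can still have a low PPV. The sketch below uses hypothetical numbers (sensitivity 0.80, specificity 0.90, prevalence 1%) that do not come from the review, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    Inputs are proportions strictly between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# HYPOTHETICAL operating point and prevalence, for illustration only.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.90, prevalence=0.01)
print(f"PPV={ppv:.3f} NPV={npv:.3f}")
```

Under these assumed values most flagged individuals are false positives even though NPV is close to 1, which is why accuracy-style summaries alone can mislead in this setting.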
- Asia > Middle East > Iraq (0.04)
- Asia > Afghanistan (0.04)
- North America > Canada (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
Salvaging Forbidden Treasure in Medical Data: Utilizing Surrogate Outcomes and Single Records for Rare Event Modeling
Yin, Xiaohui, Sacco, Shane, Aseltine, Robert H., Wang, Fei, Chen, Kun
The vast repositories of Electronic Health Records (EHR) and medical claims hold untapped potential for studying rare but critical events, such as suicide attempts. Conventional setups often model suicide attempt as a univariate outcome and exclude "single-record" patients, those with only one documented encounter, because they lack historical information. However, patients who were diagnosed with a suicide attempt at their only encounter can, surprisingly, represent a substantial proportion of all attempt cases in the data, as high as 70-80%. We develop a hybrid and integrative learning framework that leverages concurrent outcomes as surrogates and harnesses the forbidden yet precious information in single-record data. Our approach employs a supervised learning component to learn the latent variables that connect primary outcomes (e.g., suicide) and surrogate outcomes (e.g., mental disorders) to historical information. It simultaneously employs an unsupervised learning component to utilize the single-record data through the shared latent variables. As such, our approach offers a general strategy for information integration that is crucial to modeling rare conditions and events. With hospital inpatient data from Connecticut, we demonstrate that single-record data and concurrent diagnoses indeed carry valuable information, and that utilizing them can substantially improve suicide risk modeling.
- North America > United States > Connecticut (0.24)
- North America > United States > New York (0.04)
- North America > United States > Kansas (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Exploring Gender-Specific Speech Patterns in Automatic Suicide Risk Assessment
Gerczuk, Maurice, Amiriparian, Shahin, Lutz, Justina, Strube, Wolfgang, Papazova, Irina, Hasan, Alkomiet, Schuller, Björn W.
In emergency medicine, timely intervention for patients at risk of suicide is often hindered by delayed access to specialised psychiatric care. To bridge this gap, we introduce a speech-based approach for automatic suicide risk assessment. Our study involves a novel dataset comprising speech recordings of 20 patients who read neutral texts. We extract four speech representations encompassing interpretable and deep features. Further, we explore the impact of gender-based modelling and phrase-level normalisation. By applying gender-exclusive modelling, features extracted from an emotion fine-tuned wav2vec2.0 model can be utilised to discriminate high from low suicide risk with a balanced accuracy of 81%. Finally, our analysis reveals a discrepancy in the relationship between speech characteristics and suicide risk for female and male subjects. For men in our dataset, suicide risk increases together with agitation, while voice characteristics of female subjects point the other way.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Europe > Italy (0.04)
- North America > Canada (0.04)
- Research Report > New Finding (0.69)
- Research Report > Experimental Study (0.47)
Enhancing Suicide Risk Assessment: A Speech-Based Automated Approach in Emergency Medicine
Amiriparian, Shahin, Gerczuk, Maurice, Lutz, Justina, Strube, Wolfgang, Papazova, Irina, Hasan, Alkomiet, Kathan, Alexander, Schuller, Björn W.
Delayed access to specialized psychiatric assessment and care for patients at risk of suicide in emergency departments creates a notable gap in timely intervention, hindering the provision of adequate mental health support during critical situations. To address this, we present a non-invasive, speech-based approach for automatic suicide risk assessment. For our study, we have collected a novel dataset of speech recordings from 20 patients, from which we extract three sets of features: wav2vec, interpretable speech and acoustic features, and deep learning-based spectral representations. We then perform binary classification of suicide risk in a leave-one-subject-out fashion. Our most effective speech model achieves a balanced accuracy of 66.2%. Moreover, we show that integrating our speech model with a series of patients' metadata, such as the history of suicide attempts or access to firearms, improves the overall result. The metadata integration yields a balanced accuracy of 94.4%, an absolute improvement of 28.2 percentage points, demonstrating the efficacy of our proposed approaches for automatic suicide risk assessment in emergency medicine.
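The evaluation protocol named in this abstract can be sketched in a few lines. In leave-one-subject-out evaluation, all recordings of one patient are held out per fold so that no speaker appears in both training and test data, and balanced accuracy is the mean of sensitivity and specificity. The code below is a minimal sketch with hypothetical subject IDs and labels; the function names are illustrative and this is not the authors' implementation.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = high risk)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# HYPOTHETICAL recording-to-subject assignment, for illustration only.
subject_ids = ["p01", "p01", "p02", "p02", "p03"]
folds = list(leave_one_subject_out(subject_ids))

# Difference between the two balanced accuracies reported in the abstract.
gain_points = round(94.4 - 66.2, 1)
print(len(folds), gain_points)
```

Because each fold's test set contains a single subject, predictions are typically pooled across folds before balanced accuracy is computed; whether the study pools in this way is not stated in the abstract.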
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- North America > United States (0.05)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Italy (0.04)