smoking
Can we use LLMs to bootstrap reinforcement learning? -- A case study in digital health behavior change
Albers, Nele, de Groot, Esra Cemre Su, Keijsers, Loes, Hillegers, Manon H., Krahmer, Emiel
Personalizing digital applications for health behavior change is a promising route to making them more engaging and effective. This especially holds for approaches that adapt to users and their specific states (e.g., motivation, knowledge, wants) over time. However, developing such approaches requires making many design choices, whose effectiveness is difficult to predict from the literature and costly to evaluate in practice. In this work, we explore whether large language models (LLMs) can be used out-of-the-box to generate samples of user interactions that provide useful information for training reinforcement learning models in digital behavior change settings. Using real user data from four large behavior change studies as comparison, we show that LLM-generated samples can be useful in the absence of real data. Comparisons to samples provided by human raters further show that LLM-generated samples match the raters' performance. Additional analyses of different prompting strategies, including shorter and longer prompt variants, chain-of-thought prompting, and few-shot prompting, show that the relative effectiveness of each strategy depends on both the study and the LLM, with relatively large differences arising even between paraphrases of the same prompt. We provide recommendations for how LLM-generated samples can be used in practice.
- Europe > Austria > Vienna (0.14)
- Europe > Netherlands > South Holland > Rotterdam (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (11 more...)
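The abstract above describes training reinforcement-learning models from LLM-generated interaction samples. A minimal sketch of the downstream step, assuming a tabular MDP and a stub in place of the LLM call (all names, states, and reward values here are illustrative, not from the paper): sampled (state, action, reward, next state) tuples are pooled into empirical transition and reward estimates, then solved with value iteration.

```python
from collections import defaultdict

def fit_model(samples, states, actions):
    """Empirical reward and transition estimates from (s, a, r, s') tuples."""
    counts = defaultdict(lambda: defaultdict(int))
    rewards = defaultdict(float)
    totals = defaultdict(int)
    for s, a, r, s2 in samples:
        counts[(s, a)][s2] += 1
        rewards[(s, a)] += r
        totals[(s, a)] += 1
    P = {(s, a): {s2: c / totals[(s, a)] for s2, c in counts[(s, a)].items()}
         for (s, a) in totals}
    R = {sa: rewards[sa] / totals[sa] for sa in totals}
    return P, R

def value_iteration(P, R, states, actions, gamma=0.9, iters=100):
    """In-place value iteration over the fitted model."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        for s in states:
            V[s] = max(R.get((s, a), 0.0) +
                       gamma * sum(p * V[s2] for s2, p in P.get((s, a), {}).items())
                       for a in actions)
    return V

def llm_sampler_stub(state, action):
    # Stand-in for an LLM asked to role-play a user's reaction; in the
    # paper's setting this would be an out-of-the-box LLM call.
    reward = 1.0 if action == "motivating_message" else 0.2
    return reward, state  # self-loop for simplicity

states = ["low_motivation"]
actions = ["motivating_message", "neutral_message"]
samples = [("low_motivation", a, *llm_sampler_stub("low_motivation", a))
           for a in actions for _ in range(10)]
P, R = fit_model(samples, states, actions)
V = value_iteration(P, R, states, actions)
```

The same `fit_model` would accept real user data unchanged, which is what makes LLM-generated samples a drop-in bootstrap when no real data exist yet.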
ChatThero: An LLM-Supported Chatbot for Behavior Change and Therapeutic Support in Addiction Recovery
Wang, Junda, Yao, Zonghai, Li, Lingxi, Qian, Junhui, Yang, Zhichao, Yu, Hong
Substance use disorders (SUDs) affect millions of people, and relapses are common, requiring multi-session treatments. Access to care is limited, which compounds the challenge of recovery support. We present \textbf{ChatThero}, an innovative low-cost, multi-session, stressor-aware, and memory-persistent autonomous \emph{language agent} designed to facilitate long-term behavior change and therapeutic support in addiction recovery. Unlike existing work that mostly finetunes large language models (LLMs) on patient-therapist conversation data, ChatThero was trained in a multi-agent simulated environment that mirrors real therapy. We created anonymized patient profiles from recovery communities (e.g., Reddit) and classify patients as \texttt{easy}, \texttt{medium}, or \texttt{difficult}, three levels representing their resistance to recovery. We created an external environment by introducing stressors (e.g., social determinants of health) to simulate real-world situations, and we dynamically inject clinically grounded therapeutic strategies (motivational interviewing and cognitive behavioral therapy). Our evaluation, conducted by both humans (blinded clinicians) and LLM-as-Judge, shows that ChatThero is superior in empathy and clinical relevance. We show that stressor simulation improves the robustness of ChatThero: explicit stressors increase relapse-like setbacks, matching real-world patterns. We evaluate ChatThero with behavioral change metrics. On a 1--5 scale, ChatThero raises \texttt{motivation} by $+1.71$ points (from $2.39$ to $4.10$) and \texttt{confidence} by $+1.67$ points (from $1.52$ to $3.19$), substantially outperforming GPT-5. On \texttt{difficult} patients, ChatThero reaches the success milestone with $26\%$ fewer turns than GPT-5.
- Europe > France (0.04)
- North America > United States > Massachusetts > Middlesex County > Lowell (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
- Health & Medicine > Consumer Health (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.66)
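ChatThero's training setup, a multi-turn simulation with patient difficulty levels and stressor injection, can be caricatured in a few lines. This is a toy sketch under invented dynamics: the receptiveness values, stressor probability, and update sizes are all assumptions for illustration, not the paper's parameters.

```python
import random

# Hypothetical receptiveness per resistance level (higher = more receptive)
DIFFICULTY = {"easy": 0.9, "medium": 0.6, "difficult": 0.3}

def run_session(difficulty, turns=20, stressor_prob=0.3, seed=0):
    """Toy multi-turn loop: each therapist turn nudges motivation up,
    scaled by the patient's receptiveness; an injected stressor may
    cause a relapse-like setback."""
    rng = random.Random(seed)
    receptive = DIFFICULTY[difficulty]
    motivation = 2.0  # 1-5 scale, matching the abstract's metric range
    for _ in range(turns):
        # therapeutic strategy (e.g., an MI or CBT move) nudges motivation
        motivation += 0.3 * receptive
        # external stressor (e.g., a social-determinant event)
        if rng.random() < stressor_prob:
            motivation -= 0.4
        motivation = max(1.0, min(5.0, motivation))
    return motivation

final_motivation = run_session("difficult", turns=40)
```

Even in this caricature, the two qualitative patterns the abstract reports emerge: more resistant patients need more turns to reach a given motivation level, and stressors produce setbacks along the way.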
Evaluation of Causal Reasoning for Large Language Models in Contextualized Clinical Scenarios of Laboratory Test Interpretation
Bhasuran, Balu, Prosperi, Mattia, Hanna, Karim, Petrilli, John, Washington, Caretia JeLayne, He, Zhe
This study evaluates causal reasoning in large language models (LLMs) using 99 clinically grounded laboratory test scenarios aligned with Pearl's Ladder of Causation: association, intervention, and counterfactual reasoning. We examined common laboratory tests such as hemoglobin A1c, creatinine, and vitamin D, and paired them with relevant causal factors including age, gender, obesity, and smoking. Two LLMs - GPT-o1 and Llama-3.2-8b-instruct - were tested, with responses evaluated by four medically trained human experts. GPT-o1 demonstrated stronger discriminative performance (AUROC overall = 0.80 +/- 0.12) compared to Llama-3.2-8b-instruct (0.73 +/- 0.15), with higher scores across association (0.75 vs 0.72), intervention (0.84 vs 0.70), and counterfactual reasoning (0.84 vs 0.69). Sensitivity (0.90 vs 0.84) and specificity (0.93 vs 0.80) were also greater for GPT-o1, with reasoning ratings showing similar trends. Both models performed best on intervention questions and worst on counterfactuals, particularly in altered outcome scenarios. These findings suggest GPT-o1 provides more consistent causal reasoning, but refinement is required before adoption in high-stakes clinical applications.
- North America > United States > Florida > Hillsborough County > Tampa (0.14)
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- North America > United States > Florida > Leon County > Tallahassee (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
- Health & Medicine > Diagnostic Medicine > Lab Test (1.00)
- Health & Medicine > Consumer Health (1.00)
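The causal-reasoning study above summarizes discriminative performance as AUROC over expert-rated responses. As a self-contained reminder of that metric (stdlib only; the labels and scores below are invented for illustration), AUROC can be computed from binary correctness labels and model scores via the rank-sum (Mann-Whitney U) formulation:

```python
def auroc(labels, scores):
    """Rank-based AUROC, with average ranks assigned to tied scores."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j + 1) / 2.0  # 1-based average rank of the tie group
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    pos_ranks = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    return (pos_ranks - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# Example: expert binary labels vs. model confidence scores
auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This is equivalent to the probability that a randomly chosen correct response is scored above a randomly chosen incorrect one, which is why 0.80 vs 0.73 is a meaningful gap between the two models.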
A Fully Generative Motivational Interviewing Counsellor Chatbot for Moving Smokers Towards the Decision to Quit
Mahmood, Zafarullah, Ali, Soliman, Zhu, Jiading, Abdelwahab, Mohamed, Collins, Michelle Yu, Chen, Sihan, Zhao, Yi Cheng, Wolff, Jodi, Melamed, Osnat, Minian, Nadia, Maslej, Marta, Cooper, Carolynne, Ratto, Matt, Selby, Peter, Rose, Jonathan
The conversational capabilities of Large Language Models (LLMs) suggest that they may be able to perform as automated talk therapists. It is crucial to know whether such systems would be effective and adhere to known standards. We present a counsellor chatbot that focuses on motivating tobacco smokers to quit smoking. It uses a state-of-the-art LLM and a widely applied therapeutic approach called Motivational Interviewing (MI), and was developed in collaboration with clinician-scientists with expertise in MI. We also describe and validate an automated assessment of both the chatbot's adherence to MI and client responses. The chatbot was tested on 106 participants, and their confidence that they could succeed in quitting smoking was measured before the conversation and one week later. Participants' confidence increased by an average of 1.7 on a 0-10 scale. The automated assessment showed adherence to MI standards in 98% of the chatbot's utterances, higher than human counsellors. The chatbot scored well on a participant-reported metric of perceived empathy, though lower than typical human counsellors. Furthermore, participants' language indicated a good level of motivation to change, a key goal in MI. These results suggest that the automation of talk therapy with a modern LLM has promise.
- North America > Canada > Ontario > Toronto (0.14)
- North America > Canada > Alberta (0.14)
- Europe > United Kingdom (0.04)
- (17 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Personal > Interview (0.93)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
- Health & Medicine > Consumer Health (1.00)
Utilizing LLMs to Investigate the Disputed Role of Evidence in Electronic Cigarette Health Policy Formation in Australia and the UK
Curran, Damian, Chapman, Brian, Conway, Mike
Australia and the UK have developed contrasting approaches to the regulation of electronic cigarettes, with - broadly speaking - Australia adopting a relatively restrictive approach and the UK adopting a more permissive approach. Notably, these divergent policies were developed from the same broad evidence base. In this paper, to investigate differences in how the two jurisdictions manage and present evidence, we developed and evaluated a Large Language Model-based sentence classifier to perform automated analyses of electronic cigarette-related policy documents drawn from official Australian and UK legislative processes (109 documents in total). Specifically, we utilized GPT-4 to automatically classify sentences based on whether they contained claims that e-cigarettes were broadly helpful or harmful for public health. Our LLM-based classifier achieved an F-score of 0.9. Further, when applying the classifier to our entire sentence-level corpus, we found that Australian legislative documents show a much higher proportion of harmful statements, and a lower proportion of helpful statements compared to the expected values, with the opposite holding for the UK. In conclusion, this work utilized an LLM-based approach to provide evidence to support the contention that - drawing on the same evidence base - Australian ENDS-related policy documents emphasize the harms associated with ENDS products and UK policy documents emphasize the benefits. Further, our approach provides a starting point for using LLM-based methods to investigate the complex relationship between evidence and health policy formation.
- Oceania > New Zealand (0.05)
- Oceania > Australia > Victoria > Melbourne (0.05)
- North America > United States > Virginia (0.05)
- (4 more...)
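The policy-analysis paper above reports its GPT-4 sentence classifier reaching an F-score of 0.9. As a quick reminder of what that single number summarizes (the counts below are invented for illustration, not the paper's confusion matrix):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative counts for one class (e.g., 'harmful')."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative counts: 9 harmful sentences found, 1 spurious, 1 missed
p, r, f = precision_recall_f1(9, 1, 1)
```

F1 is the harmonic mean of precision and recall, so an F-score of 0.9 implies the classifier is both finding most harmful/helpful claims and not flagging many spurious ones.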
'MythBusters' star Adam Savage explores longevity and life hacks: 'There's no magic secret'
Tested's Adam Savage paired up with Medtronic to offer his commentary on what can contribute to a longer lifespan, including possible differences between men's and women's health. Former "MythBusters" star Adam Savage is exploring the science of longevity, asking how lifestyle choices, stress and even sleep affect how long we live. Savage, now a YouTube creator and head of the channel Tested, has partnered with health technology company Medtronic to engage in discussions about longevity. While not a researcher himself, he has taken a deep dive into scientific insights from experts and reflected on his own experiences. "Longevity has always been a fascination for me," Savage told Fox News Digital in an exclusive interview.
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Therapeutic Area (0.76)
AI-powered 'Death Clock' predicts how and when you'll die, down to the second... so how long do YOU have left?
If you could find out exactly how and when you'll die, would you want to know? A new AI-powered death clock claims to be able to do just that, predicting the method and age at which you will die, right down to the second. The free website, called the Death Clock, uses AI to analyze age, weight, and general outlook on life to 'accurately' predict how long you have left to live. It also asks users to input information on lifestyle habits like drinking, smoking, diet, and exercise. Users can also reveal their alleged cause of death and see how their life expectancy compares to other people of the same sex and body mass index (BMI).
Can Explainable AI Assess Personalized Health Risks from Indoor Air Pollution?
Sarkar, Pritisha, Jala, Kushalava reddy, Saha, Mousumi
While the effects of outdoor air pollution are well acknowledged, the literature inadequately addresses the impacts of indoor air pollution. Despite the daily health risks, existing research has primarily focused on monitoring and lacks accuracy in pinpointing indoor pollution sources. In this work, we thoroughly investigated the influence of indoor activities on pollution levels. A survey of 143 participants revealed limited awareness of indoor air pollution. Leveraging 65 days of diverse data encompassing activities like incense stick usage, indoor smoking, inadequately ventilated cooking, excessive AC usage, and accidental paper burning, we developed a comprehensive monitoring system. We identify pollutant sources and effects with high precision through clustering analysis and interpretability models (LIME and SHAP). Our method integrates Decision Tree, Random Forest, Naive Bayes, and SVM models, with Decision Trees performing best at 99.8% accuracy. Continuous 24-hour data allows personalized assessments for targeted pollution-reduction strategies, achieving 91% accuracy in predicting activities and pollution exposure.
- North America > United States > New York > New York County > New York City (0.05)
- Europe > Central Europe (0.04)
- Asia > India > West Bengal (0.04)
- Overview (1.00)
- Research Report > New Finding (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
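The core idea of the indoor-pollution study, a decision tree mapping pollutant readings to the activity that produced them, can be sketched by hand. The feature names, units (PM2.5 in µg/m³, CO2 in ppm, a normalized VOC index), thresholds, and activity labels below are all hypothetical; the study's actual trees are learned from its 65 days of sensor data.

```python
def classify_activity(pm25, co2, voc):
    """Hand-written stand-in for a learned decision tree: each branch
    tests one pollutant feature, and each leaf is an activity label."""
    if pm25 > 150:
        # High particulates: combustion sources, split on volatile organics
        return "incense_or_smoking" if voc > 0.5 else "cooking_unventilated"
    if co2 > 1200:
        # Elevated CO2 without particulates suggests stale, unventilated air
        return "poor_ventilation"
    return "baseline"
```

A fitted tree would pick these splits to maximize purity rather than by hand, but its output has the same readable if/else structure, which is what makes tree models a natural fit for LIME/SHAP-style explanations of personalized exposure.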