A Theoretical Framework of the Processes of Change in Psychotherapy Delivered by Artificial Agents

Herbener, Arthur Bran, Damholdt, Malene Flensborg

arXiv.org Artificial Intelligence

The question of whether artificial agents (e.g., chatbots and social robots) can replace human therapists has received notable attention following the recent launch of large language models. However, little is known about the processes of change in psychotherapy delivered by artificial agents. To facilitate hypothesis development and stimulate scientific debate, the present article offers the first theoretical framework of the processes of change in psychotherapy delivered by artificial agents. The theoretical framework rests upon a conceptual analysis of which active ingredients may be inherently linked to the presence of human therapists. We propose that human therapists' ontological status as human beings and sociocultural status as socially sanctioned healthcare professionals play crucial roles in promoting treatment outcomes. In the absence of the ontological and sociocultural status of human therapists, we propose that what we coin the genuineness gap and the credibility gap can emerge and undermine key processes of change in psychotherapy. Based on these propositions, we propose avenues for scientific investigation and practical application aimed at leveraging the respective strengths of artificial agents and human therapists. We also highlight the intricate agentic nature of artificial agents and discuss how this complicates endeavors to establish universally applicable propositions regarding the processes of change in these interventions.


People are using AI to 'sit' with them while they trip on psychedelics

MIT Technology Review

Peter--who asked to have his last name omitted from this story for privacy reasons--is far from alone. A growing number of people are using AI chatbots as "trip sitters"--a phrase that traditionally refers to a sober person tasked with monitoring someone who's under the influence of a psychedelic--and sharing their experiences online. It's a potent blend of two cultural trends: using AI for therapy and using psychedelics to alleviate mental-health problems. But this is a potentially dangerous psychological cocktail, according to experts. While it's far cheaper than in-person psychedelic therapy, it can go badly awry.


Do We Talk to Robots Like Therapists, and Do They Respond Accordingly? Language Alignment in AI Emotional Support

Chiang, Sophie, Laban, Guy, Gunes, Hatice

arXiv.org Artificial Intelligence

As conversational agents increasingly engage in emotionally supportive dialogue, it is important to understand how closely their interactions resemble those in traditional therapy settings. This study investigates whether the concerns shared with a robot align with those shared in human-to-human (H2H) therapy sessions, and whether robot responses semantically mirror those of human therapists. We analyzed two datasets: one of interactions between users and professional therapists (Hugging Face's NLP Mental Health Conversations), and another involving supportive conversations with a social robot (QTrobot from LuxAI) powered by a large language model (LLM, GPT-3.5). Using sentence embeddings and K-means clustering, we assessed cross-agent thematic alignment with a distance-based cluster-fitting method that evaluates whether responses from one agent type map onto clusters derived from the other, validating the mapping with Euclidean distances. Results showed that 90.88% of robot conversation disclosures could be mapped to clusters from the human therapy dataset, suggesting shared topical structure. For matched clusters, we compared subjects' disclosures as well as therapist and robot responses using Transformer, Word2Vec, and BERT embeddings, revealing strong semantic overlap in subjects' disclosures in both datasets, as well as in the responses given to similar human disclosure themes across agent types (robot vs. human therapist). These findings highlight both the parallels and boundaries of robot-led support conversations and their potential for augmenting mental health interventions.
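
The distance-based cluster-fitting method lends itself to a compact illustration. The sketch below fits K-means clusters on therapist-session embeddings and checks whether robot-session disclosures fall within them; the encoder name, cluster count, toy data, and alignment threshold are all assumptions for illustration, not the paper's configuration.

```python
# Hypothetical sketch: fit clusters on therapist-session embeddings, then test
# whether robot-session disclosures fall inside those clusters.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder, not the paper's

therapist_texts = [  # stand-ins for the H2H Mental Health Conversations data
    "I feel anxious every morning before work.",
    "I can't stop worrying about my family.",
    "Lately nothing feels enjoyable anymore.",
    "I lost interest in things I used to love.",
]
robot_texts = [  # stand-ins for the QTrobot (GPT-3.5) session disclosures
    "Work stress keeps me up at night.",
    "I feel flat and unmotivated these days.",
]

H = encoder.encode(therapist_texts)
R = encoder.encode(robot_texts)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(H)

# Euclidean distance of each robot disclosure to the nearest therapist-derived centroid.
d = np.linalg.norm(R[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
nearest = d.min(axis=1)

# Count a disclosure as "mapped" if it is no farther from a centroid than the
# most distant therapist utterance in its own cluster (threshold is an assumption).
radius = np.linalg.norm(H - km.cluster_centers_[km.labels_], axis=1).max()
print(f"{(nearest <= radius).mean():.2%} of robot disclosures map to therapist clusters")
```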


AI-Augmented LLMs Achieve Therapist-Level Responses in Motivational Interviewing

Huang, Yinghui, Jiang, Yuxuan, Liu, Hui, Cai, Yixin, Li, Weiqing, Hu, Xiangen

arXiv.org Artificial Intelligence

Large language models (LLMs) like GPT-4 show potential for scaling motivational interviewing (MI) in addiction care, but require systematic evaluation of therapeutic capabilities. We present a computational framework assessing user-perceived quality (UPQ) through expected and unexpected MI behaviors. Analyzing human therapist and GPT-4 MI sessions via human-AI collaboration, we developed predictive models integrating deep learning and explainable AI to identify 17 MI-consistent (MICO) and MI-inconsistent (MIIN) behavioral metrics. A customized chain-of-thought prompt improved GPT-4's MI performance, reducing inappropriate advice while enhancing reflections and empathy. Although GPT-4 remained marginally inferior to therapists overall, it demonstrated superior advice management capabilities. The model achieved measurable quality improvements through prompt engineering, yet showed limitations in addressing complex emotional nuances. This framework establishes a pathway for optimizing LLM-based therapeutic tools through targeted behavioral metric analysis and human-AI co-evaluation. Findings highlight both the scalability potential and current constraints of LLMs in clinical communication applications.
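
The paper's customized chain-of-thought prompt is not reproduced in the abstract; the following is a hypothetical sketch of how such a prompt might steer GPT-4 toward MI-consistent behaviors (reflect first, advise only with permission), using the OpenAI Python client. The prompt wording and model choice are illustrative assumptions.

```python
# Hypothetical MI chain-of-thought prompt; the wording is illustrative, not the
# prompt engineered in the paper.
from openai import OpenAI

MI_COT_SYSTEM = """You are a motivational interviewing (MI) counselor.
Before replying, reason through these steps silently:
1. Identify the client's feelings and any change talk in their message.
2. Draft a reflection that mirrors the client's own words.
3. Only consider advice if the client asked for it; otherwise ask permission first.
Then reply with the reflection and, at most, one open question."""

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
resp = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": MI_COT_SYSTEM},
        {"role": "user", "content": "I know I drink too much, but it's how I cope."},
    ],
)
print(resp.choices[0].message.content)
```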


"It Listens Better Than My Therapist": Exploring Social Media Discourse on LLMs as Mental Health Tool

Haensch, Anna-Carolina

arXiv.org Artificial Intelligence

The emergence of generative AI chatbots such as ChatGPT has prompted growing public and academic interest in their role as informal mental health support tools. While early rule-based systems have existed for several years, large language models (LLMs) offer new capabilities in conversational fluency, empathy simulation, and availability. This study explores how users engage with LLMs as mental health tools by analyzing over 10,000 TikTok comments from videos referencing LLMs in that role. Using a self-developed tiered coding schema and supervised classification models, we identify user experiences, attitudes, and recurring themes. Results show that nearly 20% of comments reflect personal use, with these users expressing overwhelmingly positive attitudes. Commonly cited benefits include accessibility, emotional support, and perceived therapeutic value. However, concerns around privacy, generic responses, and the lack of professional oversight remain prominent. It is important to note that the user feedback does not indicate which therapeutic framework, if any, the LLM-generated output aligns with. While the findings underscore the growing relevance of AI in everyday practices, they also highlight the urgent need for clinical and ethical scrutiny in the use of AI for mental health support. This study does not endorse or encourage the use of AI tools as substitutes for professional mental health support.
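
The abstract does not specify the classifier behind the tiered coding schema; as a hedged illustration of the supervised-classification step, a simple text-classification baseline could look like the following, with invented comments and labels standing in for the coded TikTok data.

```python
# Hypothetical baseline for one tier of the coding schema (e.g., personal use
# vs. attitude only); comments and labels are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

comments = [
    "ChatGPT helped me through a panic attack last night",
    "this is so dystopian, please see a real therapist",
    "I talk to it every day instead of my counselor",
    "AI can never replace human empathy",
]
labels = ["personal_use", "attitude_only", "personal_use", "attitude_only"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(comments, labels)
print(clf.predict(["my therapist is an app now and honestly it listens better"]))
```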


Psy-Copilot: Visual Chain of Thought for Counseling

Chen, Keqi, Sun, Zekai, Lian, Huijun, Gao, Yingming, Li, Ya

arXiv.org Artificial Intelligence

Large language models (LLMs) are becoming increasingly popular in the field of psychological counseling. However, when human therapists work with LLMs in therapy sessions, it is hard to understand how the model arrives at its answers. To address this, we have constructed Psy-COT, a graph designed to visualize the thought processes of LLMs during therapy sessions. The Psy-COT graph presents semi-structured counseling conversations alongside step-by-step annotations that capture the reasoning and insights of therapists. Moreover, we have developed Psy-Copilot, a conversational AI assistant designed to assist human psychological therapists in their consultations. It offers traceable, retrieval-based psychological information, including response candidates, similar dialogue sessions, related strategies, and visual traces of results. We have also built an interactive platform for AI-assisted counseling, with an interface that displays the relevant parts of the retrieval sub-graph. Psy-Copilot is designed not to replace psychotherapists but to foster collaboration between AI and human therapists, thereby promoting mental health care. Our code and demo are both open-sourced and available for use.
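
The authors' open-sourced code is the authoritative reference; purely as a sketch of the retrieval idea, the snippet below surfaces the most similar annotated counseling turn for a new client utterance, using an assumed sentence-embedding index rather than the project's actual graph store.

```python
# Hypothetical retrieval step: surface the most similar annotated counseling
# turn so the human therapist can trace where a suggestion came from.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model

# Toy stand-ins for annotated Psy-COT turns: (client utterance, strategy note)
index = [
    ("I argue with my parents about everything.", "explore family dynamics"),
    ("I feel worthless after the breakup.", "validate, then cognitive reframing"),
    ("I can't sleep before exams.", "psychoeducation on anxiety"),
]
vecs = encoder.encode([turn for turn, _ in index])

query = "Every conversation with my mother turns into a fight."
q = encoder.encode([query])[0]

# Cosine similarity between the query and each indexed turn.
sims = vecs @ q / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q))
best = int(np.argmax(sims))
print("Similar session turn:", index[best][0])
print("Associated strategy:", index[best][1])
```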


AI chatbots posing as therapists could have 'dangerous' and violent consequences for patients, experts say

FOX News

Call the 988 Suicide and Crisis Lifeline or text TALK to 741741 at the Crisis Text Line if you are in need of help. Health experts say that artificial intelligence (AI) chatbots posing as therapists could cause "serious harm" to struggling people, including adolescents, without the proper safety measures. Christine Yu Moutier, M.D., Chief Medical Officer at the American Foundation for Suicide Prevention, told Fox News Digital there are "critical gaps" in research regarding the intended and unintended impacts of AI on suicide risk, mental health and larger human behavior. "The problem with these AI chatbots is that they were not designed with expertise on suicide risk and prevention baked into the algorithms. Additionally, there is no helpline available on the platform for users who may be at risk of a mental health condition or suicide, no training on how to use the tool if you are at risk, nor industry standards to regulate these technologies," Moutier said.


Enhancing Mental Health Support through Human-AI Collaboration: Toward Secure and Empathetic AI-enabled chatbots

AlMakinah, Rawan, Norcini-Pala, Andrea, Disney, Lindsey, Canbaz, M. Abdullah

arXiv.org Artificial Intelligence

Access to mental health support remains limited, particularly in marginalized communities where structural and cultural barriers hinder timely care. This paper explores the potential of AI-enabled chatbots as a scalable solution, focusing on advanced large language models (LLMs): GPT-4, Mistral Large, and Llama 3.1. We assess their ability to deliver empathetic, meaningful responses in mental health contexts. While these models show promise in generating structured responses, they fall short in replicating the emotional depth and adaptability of human therapists. Additionally, trustworthiness, bias, and privacy challenges persist due to unreliable datasets and limited collaboration with mental health professionals. To address these limitations, we propose a federated learning framework that ensures data privacy, reduces bias, and integrates continuous validation from clinicians to enhance response quality. This approach aims to develop a secure, evidence-based AI chatbot capable of offering trustworthy, empathetic, and bias-reduced mental health support, advancing AI's role in digital mental health care.
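
In this setting, federated learning would mean each clinic fine-tunes on its private data and shares only parameter updates with a central server. A minimal FedAvg sketch on a toy linear model (data shapes, learning rate, and round count are illustrative assumptions) conveys the mechanism:

```python
# Toy FedAvg round: each site trains on private data and only weight vectors
# leave the site; all shapes and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
d = 8                      # feature dimension (assumed)
global_w = np.zeros(d)

def local_update(w, X, y, lr=0.1, epochs=5):
    """One site's private gradient-descent steps on a least-squares objective."""
    w = w.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

# Three sites with private (X, y); the raw data never leaves each site.
sites = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(3)]

for _ in range(10):
    updates = [local_update(global_w, X, y) for X, y in sites]
    global_w = np.mean(updates, axis=0)   # the server averages parameters only

print("global weights after 10 rounds:", np.round(global_w, 3))
```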


Towards a Client-Centered Assessment of LLM Therapists by Client Simulation

Wang, Jiashuo, Xiao, Yang, Li, Yanran, Song, Changhe, Xu, Chunpu, Tan, Chenhao, Li, Wenjie

arXiv.org Artificial Intelligence

Although there is a growing belief that LLMs can be used as therapists, exploration of LLMs' capabilities and shortcomings, particularly from the client's perspective, remains limited. This work focuses on a client-centered assessment of LLM therapists with the involvement of simulated clients, a standard approach in clinical medical education. However, there are two challenges when applying the approach to assess LLM therapists at scale. Ethically, asking humans to frequently mimic clients and exposing them to potentially harmful LLM outputs can be risky and unsafe. Technically, it can be difficult to consistently compare the performances of different LLM therapists interacting with the same client. To this end, we adopt LLMs to simulate clients and propose ClientCAST, a client-centered approach to assessing LLM therapists by client simulation. Specifically, the simulated client interacts with LLM therapists and completes questionnaires related to the interaction. Based on the questionnaire results, we assess LLM therapists from three client-centered aspects: session outcome, therapeutic alliance, and self-reported feelings. We conduct experiments to examine the reliability of ClientCAST and use it to evaluate LLM therapists implemented with Claude-3, GPT-3.5, LLaMA3-70B, and Mixtral 8x7B. Code is released at https://github.com/wangjs9/ClientCAST.
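
The released repository is the definitive implementation; the loop below is only a schematic of the ClientCAST idea: one LLM role-plays the client, another acts as the therapist under evaluation, and the client model then answers a questionnaire item about the session. Model names and prompts are placeholders.

```python
# Schematic of ClientCAST-style assessment; prompts and model are placeholders,
# not the paper's configuration (see the authors' repository for the real code).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def speak(system_prompt, transcript):
    """Ask one role-playing model for its next turn given the transcript so far."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": transcript or "Begin the session."},
        ],
    )
    return resp.choices[0].message.content

CLIENT_SYS = "Role-play a therapy client with social anxiety. Reply with one client turn."
THERAPIST_SYS = "You are the LLM therapist being evaluated. Reply with one therapist turn."

transcript = ""
for _ in range(3):  # a short simulated session
    transcript += "Client: " + speak(CLIENT_SYS, transcript) + "\n"
    transcript += "Therapist: " + speak(THERAPIST_SYS, transcript) + "\n"

# The simulated client then completes a questionnaire item about the session,
# e.g., a working-alliance statement rated on a Likert scale.
item = "As the client, rate 1-7 and explain: 'The therapist and I agree on my goals.'"
print(speak(CLIENT_SYS, transcript + "\n" + item))
```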


Evaluating and Personalizing User-Perceived Quality of Text-to-Speech Voices for Delivering Mindfulness Meditation with Different Physical Embodiments

Shi, Zhonghao, Chen, Han, Velentza, Anna-Maria, Liu, Siqi, Dennler, Nathaniel, O'Connell, Allison, Matarić, Maja

arXiv.org Artificial Intelligence

Mindfulness-based therapies have been shown to be effective in improving mental health, and technology-based methods have the potential to expand the accessibility of these therapies. To enable real-time personalized content generation for mindfulness practice in these methods, high-quality computer-synthesized text-to-speech (TTS) voices are needed to provide verbal guidance and respond to user performance and preferences. However, the user-perceived quality of state-of-the-art TTS voices has not yet been evaluated for administering mindfulness meditation, which requires emotional expressiveness. In addition, the effects of physical embodiment and personalization on the user-perceived quality of TTS voices for mindfulness have not yet been studied. To that end, we designed a two-phase human subject study. In Phase 1, an online Mechanical Turk between-subject study (N=471) evaluated 3 state-of-the-art TTS voices (feminine, masculine, child-like) against 2 human therapists' voices (feminine, masculine) in 3 physical embodiment settings (no agent, conversational agent, socially assistive robot) with remote participants. Building on findings from Phase 1, in Phase 2, an in-person within-subject study (N=94), we used a novel framework we developed for personalizing TTS voices based on user preferences and evaluated their user-perceived quality against the best-rated non-personalized voices from Phase 1. We found that the best-rated human voice was perceived better than all TTS voices; the emotional expressiveness and naturalness of TTS voices were rated poorly, while users were satisfied with their clarity. Surprisingly, when users were allowed to fine-tune TTS voice features, the personalized TTS voices performed almost as well as human voices, suggesting that user personalization could be a simple and very effective tool for improving the user-perceived quality of TTS voices.
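
The study personalized rich neural-TTS voice features; as a down-to-earth analogue of letting users tune voice parameters and preview the result, the sketch below uses the offline pyttsx3 engine (a stand-in, not the study's system), exposing only speaking rate and volume.

```python
# Illustrative only: let a user tune basic voice features and preview the
# result; the study personalized richer neural-TTS features, not pyttsx3.
import pyttsx3

def preview(rate, volume, text="Take a slow, deep breath in... and out."):
    engine = pyttsx3.init()
    engine.setProperty("rate", rate)      # words per minute
    engine.setProperty("volume", volume)  # 0.0 to 1.0
    engine.say(text)
    engine.runAndWait()

# A user iterates over settings until satisfied, mirroring the Phase 2 idea
# of fine-tuning voice features to personal preference.
preferences = {"rate": 150, "volume": 0.8}
preview(**preferences)
```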