Evaluating AI Counseling in Japanese: Counselor, Client, and Evaluator Roles Assessed by Motivational Interviewing Criteria
Kiuchi, Keita; Fujimoto, Yoshikazu; Goto, Hideyuki; Hosokawa, Tomonori; Nishimura, Makoto; Sato, Yosuke; Sezai, Izumi
arXiv.org Artificial Intelligence
This study provides the first comprehensive evaluation of large language model (LLM) performance across three counseling roles in Japanese-language therapeutic contexts. We simultaneously assessed counselor artificial intelligence (AI) systems (GPT-4-turbo with zero-shot prompting or Structured Multi-step Dialogue Prompts (SMDP), Claude-3-Opus-SMDP), client AI simulations, and evaluation AI systems (o3, Claude-3.7-Sonnet, Gemini-2.5-pro). Human experts (n = 15) with extensive counseling experience evaluated AI-generated dialogues using the Motivational Interviewing Treatment Integrity (MITI) Coding Manual 4.2.1. Notably, SMDP implementation significantly enhanced counselor AI performance across all MITI global ratings compared with zero-shot prompting, with no significant differences between GPT-SMDP and Opus-SMDP. Evaluation AIs performed comparably to human raters for Cultivating Change Talk but systematically overestimated Softening Sustain Talk and overall quality metrics. Model-specific biases emerged: Gemini emphasized power-sharing, o3 focused on technical proficiency, and Sonnet prioritized emotional expression. Client AI simulations exhibited a limited emotional range and unnaturally high compliance, indicating the need for enhanced realism. These findings establish benchmarks for AI-assisted counseling in non-English contexts and identify critical areas for improvement through advanced prompt engineering, retrieval-augmented generation, and targeted fine-tuning, with important implications for developing culturally sensitive AI mental health tools.
Jul-9-2025
- Country:
- Asia
- Indonesia > Bali (0.04)
- Japan > Honshū
- Chūgoku > Okayama Prefecture
- Okayama (0.04)
- Kansai > Osaka Prefecture
- Osaka (0.04)
- Kantō
- Kanagawa Prefecture (0.04)
- Saitama Prefecture > Saitama (0.04)
- Tokyo Metropolis Prefecture > Tokyo (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- Singapore (0.04)
- North America > United States
- Florida > Miami-Dade County
- Miami (0.04)
- New Mexico > Bernalillo County
- Albuquerque (0.04)
- South America > Chile
- Genre:
- Personal > Interview (0.93)
- Research Report
- Experimental Study > Negative Result (0.48)
- New Finding (1.00)
- Industry:
- Education (1.00)
- Health & Medicine > Therapeutic Area
- Psychiatry/Psychology > Mental Health (1.00)
- Information Technology (0.92)
- Technology: