S-Chain: Structured Visual Chain-of-Thought For Medicine
Le-Duc, Khai; Nguyen, Duy M. H.; Trinh, Phuong T. H.; Nguyen, Tien-Phat; Diep, Nghiem T.; Ngo, An; Vu, Tung; Vuong, Trinh; Nguyen, Anh-Tien; Nguyen, Mau; Hoang, Van Trung; Nguyen, Khai-Nguyen; Nguyen, Hy; Ngo, Chris; Liu, Anji; Ho, Nhat; Hauschild, Anne-Christin; Nguyen, Khanh Xuan; Nguyen-Tang, Thanh; Xie, Pengtao; Sonntag, Daniel; Zou, James; Niepert, Mathias; Nguyen, Anh Totti
arXiv.org Artificial Intelligence
Faithful reasoning in medical vision-language models (VLMs) requires not only accurate predictions but also transparent alignment between textual rationales and visual evidence. While Chain-of-Thought (CoT) prompting has shown promise in medical visual question answering (VQA), no large-scale expert-level dataset has captured stepwise reasoning with precise visual grounding. We introduce S-Chain, the first large-scale dataset of 12,000 expert-annotated medical images with bounding boxes and structured visual CoT (SV-CoT), explicitly linking visual regions to reasoning steps. The dataset further supports 16 languages, totaling over 700k VQA pairs for broad multilingual applicability. Using S-Chain, we benchmark state-of-the-art medical VLMs (ExGra-Med, LLaVA-Med) and general-purpose VLMs (Qwen2.5-VL, InternVL2.5), showing that SV-CoT supervision significantly improves interpretability, grounding fidelity, and robustness. Beyond benchmarking, we study its synergy with retrieval-augmented generation, revealing how domain knowledge and visual grounding interact during autoregressive reasoning. Finally, we propose a new mechanism that strengthens the alignment between visual evidence and reasoning, improving both reliability and efficiency. S-Chain establishes a new benchmark for grounded medical reasoning and paves the way toward more trustworthy and explainable medical VLMs.
Oct-28-2025
- Country:
  - Asia
    - Japan (0.04)
    - Middle East > UAE (0.04)
    - Singapore (0.04)
    - South Korea (0.04)
    - Vietnam (0.04)
  - Europe > Germany
    - Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
  - North America
    - Canada
    - United States
      - California
        - Alameda County > Berkeley (0.04)
        - San Diego County > San Diego (0.04)
      - New Jersey (0.04)
      - Texas > Travis County
        - Austin (0.04)
  - Oceania > Australia (0.04)
  - South America > Brazil (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Health & Medicine
    - Diagnostic Medicine > Imaging (1.00)
    - Health Care Technology (1.00)
    - Nuclear Medicine (0.93)
    - Pharmaceuticals & Biotechnology (1.00)
    - Therapeutic Area > Neurology
      - Alzheimer's Disease (1.00)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)
    - Natural Language
      - Chatbot (1.00)
      - Large Language Model (1.00)
    - Representation & Reasoning (1.00)
    - Vision (1.00)