CLARIFY: A Specialist-Generalist Framework for Accurate and Lightweight Dermatological Visual Question Answering
Saha, Aranya, Khan, Tanvir Ahmed, Swapnil, Ismam Nur, Haque, Mohammad Ariful
–arXiv.org Artificial Intelligence
Vision-language models (VLMs) have shown significant potential for medical tasks; however, their general-purpose nature can limit specialized diagnostic accuracy, and their large size imposes substantial inference costs on real-world clinical deployment. To address these challenges, we introduce CLARIFY, a Specialist-Generalist framework for dermatological visual question answering (VQA). CLARIFY combines two components: (i) a lightweight, domain-trained image classifier (the Specialist) that provides fast and highly accurate diagnostic predictions, and (ii) a powerful yet compressed conversational VLM (the Generalist) that generates natural language explanations in response to user queries. This synergy is further enhanced by a knowledge graph-based retrieval module, which grounds the Generalist's responses in factual dermatological knowledge to improve both accuracy and reliability. This hierarchical design not only reduces diagnostic errors but also significantly improves computational efficiency. Experiments on our curated multimodal dermatology dataset demonstrate that CLARIFY achieves an 18% improvement in diagnostic accuracy over the strongest baseline (a fine-tuned, uncompressed single-line VLM) while reducing the average VRAM requirement and latency by at least 20% and 5%, respectively. These results indicate that a Specialist-Generalist system provides a practical and powerful paradigm for building lightweight, trustworthy, and clinically viable AI systems.

Vision-language models (VLMs) like LLaVA [1] and Qwen-VL [2] have demonstrated a remarkable ability to interpret and reason about joint visual and textual data [3]. Their potential in medicine is vast, with promising applications in tasks ranging from radiological report generation to comprehensive clinical decision support [4], [5]. However, translating this potential into reliable clinical tools faces some critical hurdles.
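The Specialist-Generalist flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every name here (`specialist_predict`, `retrieve_facts`, `build_prompt`, `answer`, the dictionary-based knowledge graph) is a hypothetical placeholder, the classifier is replaced by precomputed class scores, and the Generalist is any callable that maps a prompt string to a reply.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical toy knowledge graph: label -> [(relation, object), ...].
# The paper's actual graph schema is not given in this excerpt.
KnowledgeGraph = Dict[str, List[Tuple[str, str]]]


@dataclass
class Diagnosis:
    label: str
    confidence: float


def specialist_predict(class_scores: Dict[str, float]) -> Diagnosis:
    """Stand-in for the Specialist: return the highest-scoring class."""
    label = max(class_scores, key=lambda name: class_scores[name])
    return Diagnosis(label=label, confidence=class_scores[label])


def retrieve_facts(kg: KnowledgeGraph, label: str, k: int = 3) -> List[str]:
    """Stand-in for graph retrieval: up to k triples stored for the label."""
    return [f"{label} {relation} {obj}" for relation, obj in kg.get(label, [])[:k]]


def build_prompt(question: str, diagnosis: Diagnosis, facts: List[str]) -> str:
    """Condition the Generalist on the predicted label and retrieved facts."""
    lines = [
        f"Predicted condition: {diagnosis.label} "
        f"(confidence {diagnosis.confidence:.2f})",
        "Retrieved facts:",
        *[f"- {fact}" for fact in facts],
        f"User question: {question}",
    ]
    return "\n".join(lines)


def answer(
    question: str,
    class_scores: Dict[str, float],
    kg: KnowledgeGraph,
    generalist: Callable[[str], str],
) -> str:
    """Run Specialist -> retrieval -> Generalist in sequence."""
    diagnosis = specialist_predict(class_scores)
    facts = retrieve_facts(kg, diagnosis.label)
    return generalist(build_prompt(question, diagnosis, facts))
```

In this sketch the Generalist never sees the raw class scores, only the Specialist's top label and the facts retrieved for it, which mirrors the hierarchical division of labor the abstract describes. Whether CLARIFY passes the prediction through the prompt in this way is an assumption.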
Aug-27-2025
- Country:
- Asia > Bangladesh
- Dhaka Division > Dhaka District > Dhaka (0.05)
- North America > Canada
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Health & Medicine
- Diagnostic Medicine (0.90)
- Therapeutic Area > Dermatology (1.00)
- Technology: