AITopics | Diagnosis

Diagnosis

Architecting Clinical Collaboration: Multi-Agent Reasoning Systems for Multimodal Medical VQA

Thakrar, Karishma, Basavatia, Shreyas, Daftardar, Akshay

arXiv.org Artificial Intelligence, Aug-27-2025

Dermatological care via telemedicine often lacks the rich context of in-person visits. Clinicians must make diagnoses based on a handful of images and brief descriptions, without the benefit of physical exams, second opinions, or reference materials. While many medical AI systems attempt to bridge these gaps with domain-specific fine-tuning, this work hypothesized that mimicking clinical reasoning processes could offer a more effective path forward. This study tested seven vision-language models on medical visual question answering across six configurations: baseline models, fine-tuned variants, and both augmented with either reasoning layers that combine multiple model perspectives, analogous to peer consultation, or retrieval-augmented generation that incorporates medical literature at inference time, serving a role similar to reference-checking. While fine-tuning degraded performance in four of seven models with an average 30% decrease, baseline models collapsed on test data. Clinical-inspired architectures, meanwhile, achieved up to 70% accuracy, maintaining performance on unseen data while generating explainable, literature-grounded outputs critical for clinical adoption. These findings demonstrate that medical AI succeeds by reconstructing the collaborative and evidence-based practices fundamental to clinical diagnosis. Fine-tuning large models on medical data, the standard approach to medical AI, assumes domain exposure produces clinical competence [1]. Yet dermatology models show 15% performance drops in real-world settings [2], and catastrophic forgetting causes models to generate outputs exclusively from their training data [3]. This brittleness suggests a fundamental mismatch between current approaches and clinical reasoning. Additionally, physician groups achieve 85.6% diagnostic accuracy versus 62.5% for individuals [4], as collaboration reduces cognitive load and bias [5]. However, logistical constraints force physicians to work alone, a problem telemedicine intensifies by eliminating physical exams, peer consultation, and immediate reference access [6].

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2507.0552

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Health Care Technology > Telehealth (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Dermatology (0.89)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
(2 more...)

An Agentic System for Rare Disease Diagnosis with Traceable Reasoning

Zhao, Weike, Wu, Chaoyi, Fan, Yanjie, Zhang, Xiaoman, Qiu, Pengcheng, Sun, Yuze, Zhou, Xiao, Wang, Yanfeng, Sun, Xin, Zhang, Ya, Yu, Yongguo, Sun, Kun, Xie, Weidi

arXiv.org Artificial Intelligence, Aug-27-2025

Rare diseases collectively affect over 300 million individuals worldwide, yet timely and accurate diagnosis remains a pervasive challenge. This is largely due to their clinical heterogeneity, low individual prevalence, and the limited familiarity most clinicians have with rare conditions. Here, we introduce DeepRare, the first rare disease diagnosis agentic system powered by a large language model (LLM), capable of processing heterogeneous clinical inputs. The system generates ranked diagnostic hypotheses for rare diseases, each accompanied by a transparent chain of reasoning that links intermediate analytic steps to verifiable medical evidence. DeepRare comprises three key components: a central host with a long-term memory module; specialized agent servers responsible for domain-specific analytical tasks, integrating over 40 specialized tools; and web-scale, up-to-date medical knowledge sources, ensuring access to the most current clinical information. This modular and scalable design enables complex diagnostic reasoning while maintaining traceability and adaptability. We evaluate DeepRare on eight datasets. The system demonstrates exceptional diagnostic performance among 2,919 diseases, achieving 100% accuracy for 1,013 diseases. In HPO-based evaluations, DeepRare significantly outperforms 15 other methods, including traditional bioinformatics diagnostic tools, LLMs, and other agentic systems, achieving an average Recall@1 score of 57.18% and surpassing the second-best method (Reasoning LLM) by a substantial margin of 23.79 percentage points. For multi-modal input scenarios, DeepRare achieves 70.60% at Recall@1 compared to Exomiser's 53.20% in 109 cases. Manual verification of reasoning chains by clinical experts achieves 95.40% agreement. Furthermore, the DeepRare system has been implemented as a user-friendly web application at http://raredx.cn/doctor.
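
The Recall@1 figures quoted in this abstract measure how often the correct diagnosis is ranked first. As a rough illustration of the metric only, here is a minimal sketch in Python; the function name `recall_at_k` and the toy cases are hypothetical and are not taken from the paper or its evaluation code.

```python
from typing import Sequence


def recall_at_k(ranked: Sequence[Sequence[str]], truth: Sequence[str], k: int = 1) -> float:
    """Fraction of cases whose true diagnosis appears among the top-k ranked hypotheses."""
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        return 0.0
    hits = sum(1 for preds, gold in zip(ranked, truth) if gold in preds[:k])
    return hits / len(truth)


# Hypothetical toy cases for illustration (not data from the paper).
ranked_predictions = [
    ["Marfan syndrome", "Ehlers-Danlos syndrome"],
    ["Fabry disease", "Gaucher disease"],
    ["Wilson disease", "Hemochromatosis"],
    ["Pompe disease", "McArdle disease"],
]
true_diagnoses = ["Marfan syndrome", "Gaucher disease", "Wilson disease", "Danon disease"]

r1 = recall_at_k(ranked_predictions, true_diagnoses, k=1)
r2 = recall_at_k(ranked_predictions, true_diagnoses, k=2)
print(f"Recall@1 = {r1:.2f}, Recall@2 = {r2:.2f}")  # Recall@1 = 0.50, Recall@2 = 0.75
```

Read this way, the reported 23.79-point margin over the second-best method implies an average Recall@1 of about 33.39% for that method (57.18 minus 23.79).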

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.2043

Country:

Asia (0.46)
North America > United States > Massachusetts (0.28)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Genetic Disease (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Health Care Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning

Zheng, Qiaoyu, Sun, Yuze, Wu, Chaoyi, Zhao, Weike, Qiu, Pengcheng, Yu, Yongguo, Sun, Kun, Wang, Yanfeng, Zhang, Ya, Xie, Weidi

arXiv.org Artificial Intelligence, Aug-22-2025

Accurate diagnosis with medical large language models is hindered by knowledge gaps and hallucinations. Retrieval and tool-augmented methods help, but their impact is limited by weak use of external knowledge and poor feedback-reasoning traceability. To address these challenges, we introduce Deep-DxSearch, an agentic RAG system trained end-to-end with reinforcement learning (RL) that enables steerable, traceable retrieval-augmented reasoning for medical diagnosis. In Deep-DxSearch, we first construct a large-scale medical retrieval corpus comprising patient records and reliable medical knowledge sources to support retrieval-aware reasoning across diagnostic scenarios. More crucially, we frame the LLM as the core agent and the retrieval corpus as its environment, using tailored rewards on format, retrieval, reasoning structure, and diagnostic accuracy, thereby evolving the agentic RAG policy from large-scale data through RL. Experiments demonstrate that our end-to-end agentic RL training framework consistently outperforms prompt-engineering and training-free RAG approaches across multiple data centers. After training, Deep-DxSearch achieves substantial gains in diagnostic accuracy, surpassing strong diagnostic baselines such as GPT-4o, DeepSeek-R1, and other medical-specific frameworks for both common and rare disease diagnosis under in-distribution and out-of-distribution settings. Moreover, ablation studies on reward design and retrieval corpus components confirm their critical roles, underscoring the uniqueness and effectiveness of our approach compared with traditional implementations. Finally, case studies and interpretability analyses highlight improvements in Deep-DxSearch's diagnostic policy, providing deeper insight into its performance gains and supporting clinicians in delivering more reliable and precise preliminary diagnoses. See https://github.com/MAGIC-AI4Med/Deep-DxSearch.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2508.15746

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Root Cause Analysis of Hydrogen Bond Separation in Spatio-Temporal Molecular Dynamics using Causal Models

Adesunkanmi, Rahmat K., Khokhar, Ashfaq, Trajcevski, Goce, Murad, Sohail

arXiv.org Artificial Intelligence, Aug-19-2025

Molecular dynamics simulations (MDS) face challenges, including resource-heavy computations and the need to manually scan outputs to detect "interesting events," such as the formation and persistence of hydrogen bonds between atoms of different molecules. A critical research gap lies in identifying the underlying causes of hydrogen bond formation and separation, that is, understanding which interactions or prior events contribute to their emergence over time. With this challenge in mind, we propose leveraging spatio-temporal data analytics and machine learning models to enhance the detection of these phenomena. In this paper, our approach is inspired by causal modeling and aims to identify the root cause variables of hydrogen bond formation and separation events. Specifically, we treat the separation of hydrogen bonds as an "intervention" occurring and represent the causal structure of the bonding and separation events in the MDS as graphical causal models. These causal models are built using a variational autoencoder-inspired architecture that enables us to infer causal relationships across samples with diverse underlying causal graphs while leveraging shared dynamic information. We further include a step to infer the root causes of changes in the joint distribution of the causal models. By constructing causal models that capture shifts in the conditional distributions of molecular interactions during bond formation or separation, this framework provides a novel perspective on root cause analysis in molecular dynamic systems. We validate the efficacy of our model empirically on the atomic trajectories that used MDS for chiral separation, demonstrating that we can predict many steps in the future and also find the variables driving the observed changes in the system.

artificial intelligence, machine learning, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2508.125

Country:

Europe (0.67)
North America > United States > Iowa (0.15)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Artificial Intelligence in Rural Healthcare Delivery: Bridging Gaps and Enhancing Equity through Innovation

Balakrishnan, Kiruthika, Velusamy, Durgadevi, Hinkle, Hana E., Li, Zhi, Ramasamy, Karthikeyan, Khan, Hikmat, Ramaswamy, Srini, Shah, Pir Masoom

arXiv.org Artificial Intelligence, Aug-19-2025

Rural healthcare faces persistent challenges, including inadequate infrastructure, workforce shortages, and socioeconomic disparities that hinder access to essential services. This study investigates the transformative potential of artificial intelligence (AI) in addressing these issues in underserved rural areas. We systematically reviewed 109 studies published between 2019 and 2024 from PubMed, Embase, Web of Science, IEEE Xplore, and Scopus. Articles were screened using PRISMA guidelines and Covidence software. A thematic analysis was conducted to identify key patterns and insights regarding AI implementation in rural healthcare delivery. The findings reveal significant promise for AI applications, such as predictive analytics, telemedicine platforms, and automated diagnostic tools, in improving healthcare accessibility, quality, and efficiency. Among these, advanced AI systems, including Multimodal Foundation Models (MFMs) and Large Language Models (LLMs), offer particularly transformative potential. MFMs integrate diverse data sources, such as imaging, clinical records, and bio signals, to support comprehensive decision-making, while LLMs facilitate clinical documentation, patient triage, translation, and virtual assistance. Together, these technologies can revolutionize rural healthcare by augmenting human capacity, reducing diagnostic delays, and democratizing access to expertise. However, barriers remain, including infrastructural limitations, data quality concerns, and ethical considerations. Addressing these challenges requires interdisciplinary collaboration, investment in digital infrastructure, and the development of regulatory frameworks. This review offers actionable recommendations and highlights areas for future research to ensure equitable and sustainable integration of AI in rural healthcare systems.

data mining, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.11738

Country:

North America > United States (1.00)
Asia (1.00)
Africa (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength Medium (0.93)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Health Care Technology > Telehealth (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

On Computing Probabilistic Explanations for Decision Trees

Arenas, Marcelo

Neural Information Processing Systems, Aug-18-2025, 04:31:58 GMT

Formal XAI (explainable AI) is a growing area that focuses on computing explanations with mathematical guarantees for the decisions made by ML models.

artificial intelligence, machine learning, sufficient reason, (18 more...)

Neural Information Processing Systems

Country:

South America > Chile (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)

Risk-Based Prognostics and Health Management

Sheppard, John W.

arXiv.org Artificial Intelligence, Aug-18-2025

As engineering fields mature, new technologies are emerging that are beginning to serve as the foundation of many societal improvements. For example, modern medical diagnostic equipment provides valuable information that gives medical professionals a better understanding of a patient's needs and ultimately improves quality of life [1]. Improvements to vehicle designs make transportation in cars or aircraft safer and more environmentally friendly [2]. Military equipment continues to be developed that better supports and protects personnel in the field [3]. Manufacturing practices and robotic equipment improve work safety conditions and reduce a product's price point, making amenities available to a wider range of consumers [4]. One approach to maximizing system availability is to incorporate some means of health assessment into the system itself. Doing so is often referred to as "integrated system health management" (ISHM) or "prognostics and health management" (PHM), which has been applied successfully to many complex systems [5]. By integrating health assessment into the very functioning of a system, more information can be obtained that provides a better understanding of the system as a whole, thus allowing system owners to become proactive in how they deal with system degradation. ISHM and PHM promise to focus on system conditions, thus supporting initiatives in what has become known as condition-based maintenance (CBM). This, in turn, enables maintenance events to be initiated based on specific system conditions rather than waiting until a failure occurs [6]. One of the key ingredients of ISHM/PHM is diagnostics, which corresponds to the process of determining the health state of the system based on sets of observations (or tests). Such tests are designed specifically to track system behavior and determine whether or not a failure has occurred. In many cases it is impossible to identify a single fault that explains the observations with certainty. Instead, candidate sets of faults are often indicated, and when using applicable models, probabilities or confidence values are associated with the faults to provide additional information. One historic approach to using test observations for diagnosis is to apply a decision tree, sometimes referred to as a fault tree [7].
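
To make the idea of attaching probabilities to candidate faults concrete, the following is a minimal sketch of a single Bayes-rule update after one test outcome. It assumes exactly one fault is present and that faults are mutually exclusive; the fault names, priors, and likelihoods are hypothetical illustration values, not figures from the paper, and the paper's own models may differ.

```python
def fault_posterior(prior: dict, likelihood: dict, outcome: str) -> dict:
    """Return P(fault | outcome) via Bayes' rule, assuming a single fault is present."""
    unnormalized = {f: prior[f] * likelihood[f][outcome] for f in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the observed outcome has zero probability under every fault")
    return {f: p / total for f, p in unnormalized.items()}


# Hypothetical prior probabilities of each candidate fault.
prior = {"pump": 0.2, "valve": 0.3, "sensor": 0.5}

# Hypothetical P(test outcome | fault) for one diagnostic test.
likelihood = {
    "pump": {"fail": 0.9, "pass": 0.1},
    "valve": {"fail": 0.6, "pass": 0.4},
    "sensor": {"fail": 0.1, "pass": 0.9},
}

posterior = fault_posterior(prior, likelihood, "fail")
for fault, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{fault}: {p:.3f}")
```

With these toy numbers the pump and valve end up equally likely after the failed test, which illustrates why a diagnosis often yields a candidate set of faults rather than a single certain answer.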

artificial intelligence, machine learning, vertex, (16 more...)

arXiv.org Artificial Intelligence

2508.11031

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Consumer Health (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.66)

Supplementary Material: Identification of Partially Observed Linear Causal Models

Adams, Jeffrey, Hansen, Niels Richard

Neural Information Processing Systems, Aug-17-2025, 04:55:56 GMT

Let us present the complete theorem first, and then give its proof. We are now ready to present Theorem 1. Theorem 1 But since F induces a different DAG, F is not identified up to trivialities. Proposition 4. For any graph G there exists F F There are two cases to consider. The backward direction is obvious. This follows from definitions and acyclicity. 1.4.5 Proof of Theorem 3 Theorem 3. Then F is identifiable up to trivialities.

artificial intelligence, graph, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.40)

LD-RPMNet: Near-Sensor Diagnosis for Railway Point Machines

Li, Wei, Wu, Xiaochun, Hu, Xiaoxi, Zhang, Yuxuan, Bader, Sebastian, Huang, Yuhan

arXiv.org Artificial Intelligence, Aug-15-2025

Near-sensor diagnosis has become increasingly prevalent in industry. This study proposes a lightweight model named LD-RPMNet that integrates Transformers and Convolutional Neural Networks, leveraging both local and global feature extraction to optimize computational efficiency for a practical railway application. The LD-RPMNet introduces a Multi-scale Depthwise Separable Convolution (MDSC) module, which decomposes cross-channel convolutions into pointwise and depthwise convolutions while employing multi-scale kernels to enhance feature extraction. Meanwhile, a Broadcast Self-Attention (BSA) mechanism is incorporated to simplify complex matrix multiplications and improve computational efficiency. Experimental results based on collected sound signals during the operation of railway point machines demonstrate that the optimized model reduces parameter count and computational complexity by 50% while improving diagnostic accuracy by nearly 3%, ultimately achieving an accuracy of 98.86%. This demonstrates the possibility of near-sensor fault diagnosis applications in railway point machines.
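
The parameter savings from splitting a convolution into depthwise and pointwise parts can be checked with simple arithmetic. The sketch below counts weights for a single 2D layer, ignoring biases. The layer sizes are hypothetical, the 2D form is an assumption (the paper works with sound signals and may use a different layout), and this is not the paper's MDSC module, which also uses multi-scale kernels.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution (no bias)."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k = 64, 128, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print(standard, separable, round(separable / standard, 4))  # 73728 8768 0.1189
```

For one layer the ratio equals 1/c_out + 1/k^2. The 50% reduction reported in the abstract refers to the whole optimized model, so it is not expected to match this per-layer figure.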

artificial intelligence, expert system, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/SAS65169.2025.11105111

2506.06346

Country: Asia > China (0.70)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Ground > Rail (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

NEURAL: Attention-Guided Pruning for Unified Multimodal Resource-Constrained Clinical Evaluation

Joshi, Devvrat, Rekik, Islem

arXiv.org Artificial Intelligence, Aug-14-2025

The rapid growth of multimodal medical imaging data presents significant storage and transmission challenges, particularly in resource-constrained clinical settings. We propose NEURAL, a novel framework that addresses this by using semantics-guided data compression. Our approach repurposes cross-attention scores between the image and its radiological report from a fine-tuned generative vision-language model to structurally prune chest X-rays, preserving only diagnostically critical regions. This process transforms the image into a highly compressed graph representation. This unified graph-based representation fuses the pruned visual graph with a knowledge graph derived from the clinical report, creating a universal data structure that simplifies downstream modeling. Validated on the MIMIC-CXR and CheXpert Plus datasets for pneumonia detection, NEURAL achieves a 93.4-97.7% reduction in image data size while maintaining a high diagnostic performance of 0.88-0.95 AUC, outperforming other baseline models that use uncompressed data. By creating a persistent, task-agnostic data asset, NEURAL resolves the trade-off between data size and clinical utility, enabling efficient workflows and teleradiology without sacrificing performance. Our NEURAL code is available at https://github.com/basiralab/NEURAL.
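
The selection step behind attention-guided pruning can be pictured as keeping only the highest-scoring image patches. The sketch below shows that step alone, assuming one precomputed attention score per patch; the function name `prune_patches`, the scores, and the keep fraction are hypothetical, and the graph construction and report fusion described in the abstract are not modeled here.

```python
import math


def prune_patches(scores: list, keep_fraction: float) -> list:
    """Return the indices of the highest-scoring patches, in their original order."""
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, math.ceil(len(scores) * keep_fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical per-patch attention scores for an image split into 8 patches.
attention_scores = [0.02, 0.40, 0.05, 0.30, 0.01, 0.15, 0.04, 0.03]
kept = prune_patches(attention_scores, keep_fraction=0.25)
reduction = 1 - len(kept) / len(attention_scores)
print(kept, f"{reduction:.0%} of patches removed")  # [1, 3] 75% of patches removed
```

In this toy case a 25% keep fraction removes 75% of the patches; the abstract's 93.4-97.7% reduction in data size would correspond to keeping a much smaller share.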

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.09715

Country:

Asia (0.46)
Europe (0.28)
North America (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)
