AITopics | Diagnosis

Collaborating Authors

Diagnosis

News Overviews Instructional Materials AI-Alerts Classics

Sequential Diagnosis with Language Models

Nori, Harsha, Daswani, Mayank, Kelly, Christopher, Lundberg, Scott, Ribeiro, Marco Tulio, Wilson, Marc, Liu, Xiaoxuan, Sounderajah, Viknesh, Carlson, Jonathan, Lungren, Matthew P, Gross, Bay, Hames, Peter, Suleyman, Mustafa, King, Dominic, Horvitz, Eric

arXiv.org Artificial IntelligenceJul-3-2025

Artificial intelligence holds great promise for expanding access to expert medical knowledge and reasoning. However, most evaluations of language models rely on static vignettes and multiple-choice questions that fail to reflect the complexity and nuance of evidence-based medicine in real-world settings. In clinical practice, physicians iteratively formulate and revise diagnostic hypotheses, adapting each subsequent question and test to what they've just learned, and weigh the evolving evidence before committing to a final diagnosis. To emulate this iterative process, we introduce the Sequential Diagnosis Benchmark, which transforms 304 diagnostically challenging New England Journal of Medicine clinicopathological conference (NEJM-CPC) cases into stepwise diagnostic encounters. A physician or AI begins with a short case abstract and must iteratively request additional details from a gatekeeper model that reveals findings only when explicitly queried. Performance is assessed not just by diagnostic accuracy but also by the cost of physician visits and tests performed. We also present the MAI Diagnostic Orchestrator (MAI-DxO), a model-agnostic orchestrator that simulates a panel of physicians, proposes likely differential diagnoses and strategically selects high-value, cost-effective tests. When paired with OpenAI's o3 model, MAI-DxO achieves 80% diagnostic accuracy--four times higher than the 20% average of generalist physicians. MAI-DxO also reduces diagnostic costs by 20% compared to physicians, and 70% compared to off-the-shelf o3. When configured for maximum accuracy, MAI-DxO achieves 85.5% accuracy. These performance gains with MAI-DxO generalize across models from the OpenAI, Gemini, Claude, Grok, DeepSeek, and Llama families. We highlight how AI systems, when guided to think iteratively and act judiciously, can advance diagnostic precision and cost-effectiveness in clinical care.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2506.22405

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

An Uncertainty-Aware Dynamic Decision Framework for Progressive Multi-Omics Integration in Classification Tasks

Mu, Nan, Yang, Hongbo, Zhao, Chen

arXiv.org Artificial IntelligenceJul-3-2025

Background and Objective: High-throughput multi-omics technologies have proven invaluable for elucidating disease mechanisms and enabling early diagnosis. However, the high cost of multi-omics profiling imposes a significant economic burden, with over reliance on full omics data potentially leading to unnecessary resource consumption. To address these issues, we propose an uncertainty-aware, multi-view dynamic decision framework for omics data classification that aims to achieve high diagnostic accuracy while minimizing testing costs. Methodology: At the single-omics level, we refine the activation functions of neural networks to generate Dirichlet distribution parameters, utilizing subjective logic to quantify both the belief masses and uncertainty mass of classification results. Belief mass reflects the support of a specific omics modality for a disease class, while the uncertainty parameter captures limitations in data quality and model discriminability, providing a more trustworthy basis for decision-making. At the multi omics level, we employ a fusion strategy based on Dempster-Shafer theory to integrate heterogeneous modalities, leveraging their complementarity to boost diagnostic accuracy and robustness. A dynamic decision mechanism is then applied that omics data are incrementally introduced for each patient until either all data sources are utilized or the model confidence exceeds a predefined threshold, potentially before all data sources are utilized. Results and Conclusion: We evaluate our approach on four benchmark multi-omics datasets, ROSMAP, LGG, BRCA, and KIPAN. In three datasets, over 50% of cases achieved accurate classification using a single omics modality, effectively reducing redundant testing. Meanwhile, our method maintains diagnostic performance comparable to full-omics models and preserves essential biological insights.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence

2507.01032

Country: Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

Towards transparent and data-driven fault detection in manufacturing: A case study on univariate, discrete time series

Hofmann, Bernd, Bruendl, Patrick, Nguyen, Huong Giang, Franke, Joerg

arXiv.org Artificial IntelligenceJul-2-2025

Ensuring consistent product quality in modern manufacturing is crucial, particularly in safety-critical applications. Conventional quality control approaches, reliant on manually defined thresholds and features, lack adaptability to the complexity and variability inherent in production data and necessitate extensive domain expertise. Conversely, data-driven methods, such as machine learning, demonstrate high detection performance but typically function as black-box models, thereby limiting their acceptance in industrial environments where interpretability is paramount. This paper introduces a methodology for industrial fault detection, which is both data-driven and transparent. The approach integrates a supervised machine learning model for multi-class fault classification, Shapley Additive Explanations for post-hoc interpretability, and a do-main-specific visualisation technique that maps model explanations to operator-interpretable features. Furthermore, the study proposes an evaluation methodology that assesses model explanations through quantitative perturbation analysis and evaluates visualisations by qualitative expert assessment. The approach was applied to the crimping process, a safety-critical joining technique, using a dataset of univariate, discrete time series. The system achieves a fault detection accuracy of 95.9 %, and both quantitative selectivity analysis and qualitative expert evaluations confirmed the relevance and inter-pretability of the generated explanations. This human-centric approach is designed to enhance trust and interpretability in data-driven fault detection, thereby contributing to applied system design in industrial quality control.

artificial intelligence, expert system, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.00102

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Evaluating Rare Disease Diagnostic Performance in Symptom Checkers: A Synthetic Vignette Simulation Approach

Nishibayashi, Takashi, Kanazawa, Seiji, Yamada, Kumpei

arXiv.org Artificial IntelligenceJul-1-2025

Symptom Checkers (SCs) provide medical information tailored to user symptoms. A critical challenge in SC development is preventing unexpected performance degradation for individual diseases, especially rare diseases, when updating algorithms. This risk stems from the lack of practical pre-deployment evaluation methods. For rare diseases, obtaining sufficient evaluation data from user feedback is difficult. To evaluate the impact of algorithm updates on the diagnostic performance for individual rare diseases before deployment, this study proposes and validates a novel Synthetic Vignette Simulation Approach. This approach aims to enable this essential evaluation efficiently and at a low cost. To estimate the impact of algorithm updates, we generated synthetic vignettes from disease-phenotype annotations in the Human Phenotype Ontology (HPO), a publicly available knowledge base for rare diseases curated by experts. Using these vignettes, we simulated SC interviews to predict changes in diagnostic performance. The effectiveness of this approach was validated retrospectively by comparing the predicted changes with actual performance metrics using the R-squared ($R^2$) coefficient. Our experiment, covering eight past algorithm updates for rare diseases, showed that the proposed method accurately predicted performance changes for diseases with phenotype frequency information in HPO (n=5). For these updates, we found a strong correlation for both Recall@8 change ($R^2$ = 0.83,$p$ = 0.031) and Precision@8 change ($R^2$ = 0.78,$p$ = 0.047). Our proposed method enables the pre-deployment evaluation of SC algorithm changes for individual rare diseases. This evaluation is based on a publicly available medical knowledge database created by experts, ensuring transparency and explainability for stakeholders. Additionally, SC developers can efficiently improve diagnostic performance at a low cost.

artificial intelligence, rare disease, vignette, (11 more...)

arXiv.org Artificial Intelligence

2506.1975

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)

Add feedback

Fault-Tolerant Spacecraft Attitude Determination using State Estimation Techniques

Chidambaram, B., Hilbert, A., Silva, M.

arXiv.org Artificial IntelligenceJun-27-2025

--The extended and unscented Kalman filter, and the particle filter provide a robust framework for fault-tolerant attitude estimation on spacecraft. This paper explores how each filter performs for a large satellite in a low earth orbit. Additionally, various techniques, built on these filters, for fault detection, isolation and recovery from erroneous sensor measurements, are analyzed. Key results from this analysis include filter performance for various fault modes. Communication satellites, satellites conducting scientific research, and reentry vehicles are all examples of spacecraft that need to predict and control their attitude.

artificial intelligence, information fusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.21016

Country: North America > United States > California > Santa Clara County > Palo Alto (0.14)

Genre: Research Report (0.40)

Industry: Aerospace & Defense (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.31)

Add feedback

Evidence-based diagnostic reasoning with multi-agent copilot for human pathology

Chen, Chengkuan, Weishaupt, Luca L., Williamson, Drew F. K., Chen, Richard J., Ding, Tong, Chen, Bowen, Vaidya, Anurag, Le, Long Phi, Jaume, Guillaume, Lu, Ming Y., Mahmood, Faisal

arXiv.org Artificial IntelligenceJun-27-2025

Pathology is experiencing rapid digital transformation driven by whole-slide imaging and artificial intelligence (AI). While deep learning-based computational pathology has achieved notable success, traditional models primarily focus on image analysis without integrating natural language instruction or rich, text-based context. Current multimodal large language models (MLLMs) in computational pathology face limitations, including insufficient training data, inadequate support and evaluation for multi-image understanding, and a lack of autonomous, diagnostic reasoning capabilities. To address these limitations, we introduce PathChat+, a new MLLM specifically designed for human pathology, trained on over 1 million diverse, pathology-specific instruction samples and nearly 5.5 million question answer turns. Extensive evaluations across diverse pathology benchmarks demonstrated that PathChat+ substantially outperforms the prior PathChat copilot, as well as both state-of-the-art (SOTA) general-purpose and other pathology-specific models. Furthermore, we present SlideSeek, a reasoning-enabled multi-agent AI system leveraging PathChat+ to autonomously evaluate gigapixel whole-slide images (WSIs) through iterative, hierarchical diagnostic reasoning, reaching high accuracy on DDxBench, a challenging open-ended differential diagnosis benchmark, while also capable of generating visually grounded, humanly-interpretable summary reports.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2506.20964

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > Experimental Study (0.94)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Carcinoma (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

Add feedback

LLM-Driven Medical Document Analysis: Enhancing Trustworthy Pathology and Differential Diagnosis

Kang, Lei, Fu, Xuanshuo, Terrades, Oriol Ramos, Vazquez-Corral, Javier, Valveny, Ernest, Karatzas, Dimosthenis

arXiv.org Artificial IntelligenceJun-25-2025

Medical document analysis plays a crucial role in extracting essential clinical insights from unstructured healthcare records, supporting critical tasks such as differential diagnosis. Determining the most probable condition among overlapping symptoms requires precise evaluation and deep medical expertise. While recent advancements in large language models (LLMs) have significantly enhanced performance in medical document analysis, privacy concerns related to sensitive patient data limit the use of online LLMs services in clinical settings. To address these challenges, we propose a trustworthy medical document analysis platform that fine-tunes a LLaMA-v3 using low-rank adaptation, specifically optimized for differential diagnosis tasks. Our approach utilizes DDXPlus, the largest benchmark dataset for differential diagnosis, and demonstrates superior performance in pathology prediction and variable-length differential diagnosis compared to existing methods. The developed web-based platform allows users to submit their own unstructured medical documents and receive accurate, explainable diagnostic results. By incorporating advanced explainability techniques, the system ensures transparent and reliable predictions, fostering user trust and confidence. Extensive evaluations confirm that the proposed method surpasses current state-of-the-art models in predictive accuracy while offering practical utility in clinical settings. This work addresses the urgent need for reliable, explainable, and privacy-preserving artificial intelligence solutions, representing a significant advancement in intelligent medical document analysis for real-world healthcare applications.

differential diagnosis, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.19702

Country: Europe > Spain (0.28)

Genre: Research Report > Promising Solution (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

Cross-Architecture Knowledge Distillation (KD) for Retinal Fundus Image Anomaly Detection on NVIDIA Jetson Nano

Yilmaz, Berk, Aiyengar, Aniruddh

arXiv.org Artificial IntelligenceJun-24-2025

Early and accurate identification of retinal ailments is crucial for averting ocular decline; however, access to dependable diagnostic devices is not often available in low-resourced settings. This project proposes to solve that by developing a lightweight, edge-device deployable disease classifier using cross-architecture knowledge distilling. We first train a high-capacity vision transformer (ViT) teacher model, pre-trained using I-JEPA self-supervised learning, to classify fundus images into four classes: Normal, Diabetic Retinopathy, Glaucoma, and Cataract. We kept an Internet of Things (IoT) focus when compressing to a CNN-based student model for deployment in resource-limited conditions, such as the NVIDIA Jetson Nano. This was accomplished using a novel framework which included a Partitioned Cross-Attention (PCA) projector, a Group-Wise Linear (GL) projector, and a multi-view robust training method. The teacher model has 97.4 percent more parameters than the student model, with it achieving 89 percent classification with a roughly 93 percent retention of the teacher model's diagnostic performance. The retention of clinical classification behavior supports our method's initial aim: compression of the ViT while retaining accuracy. Our work serves as an example of a scalable, AI-driven triage solution for retinal disorders in under-resourced areas.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2506.1822

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Education (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
(2 more...)

Add feedback

Rethinking the Role of Operating Conditions for Learning-based Multi-condition Fault Diagnosis

Han, Pengyu, Liu, Zeyi, Chen, Shijin, Zou, Dongliang, He, Xiao

arXiv.org Artificial IntelligenceJun-24-2025

Multi-condition fault diagnosis is prevalent in industrial systems and presents substantial challenges for conventional diagnostic approaches. The discrepancy in data distributions across different operating conditions degrades model performance when a model trained under one condition is applied to others. With the recent advancements in deep learning, transfer learning has been introduced to the fault diagnosis field as a paradigm for addressing multi-condition fault diagnosis. Among these methods, domain generalization approaches can handle complex scenarios by extracting condition-invariant fault features. Although many studies have considered fault diagnosis in specific multi-condition scenarios, the extent to which operating conditions affect fault information has been scarcely studied, which is crucial. However, the extent to which operating conditions affect fault information has been scarcely studied, which is crucial. When operating conditions have a significant impact on fault features, directly applying domain generalization methods may lead the model to learn condition-specific information, thereby reducing its overall generalization ability. This paper investigates the performance of existing end-to-end domain generalization methods under varying conditions, specifically in variable-speed and variable-load scenarios, using multiple experiments on a real-world gearbox. Additionally, a two-stage diagnostic framework is proposed, aiming to improve fault diagnosis performance under scenarios with significant operating condition impacts. By incorporating a domain-generalized encoder with a retraining strategy, the framework is able to extract condition-invariant fault features while simultaneously alleviating potential overfitting to the source domain. Several experiments on a real-world gearbox dataset are conducted to validate the effectiveness of the proposed approach.

artificial intelligence, fault diagnosis, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2506.1774

Country: Asia > China (0.30)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

BatteryBERT for Realistic Battery Fault Detection Using Point-Masked Signal Modeling

Zhou, Songqi, Liu, Ruixue, Wang, Yixing, Lu, Jia, Jiang, Benben

arXiv.org Artificial IntelligenceJun-23-2025

Accurate fault detection in lithium-ion batteries is essential for the safe and reliable operation of electric vehicles and energy storage systems. However, existing methods often struggle to capture complex temporal dependencies and cannot fully leverage abundant unlabeled data. Although large language models (LLMs) exhibit strong representation capabilities, their architectures are not directly suited to the numerical time-series data common in industrial settings. To address these challenges, we propose a novel framework that adapts BERT-style pretraining for battery fault detection by extending the standard BERT architecture with a customized time-series-to-token representation module and a point-level Masked Signal Modeling (point-MSM) pretraining task tailored to battery applications. This approach enables self-supervised learning on sequential current, voltage, and other charge-discharge cycle data, yielding distributionally robust, context-aware temporal embeddings. We then concatenate these embeddings with battery metadata and feed them into a downstream classifier for accurate fault classification. Experimental results on a large-scale real-world dataset show that models initialized with our pretrained parameters significantly improve both representation quality and classification accuracy, achieving an AUROC of 0.945 and substantially outperforming existing approaches. These findings validate the effectiveness of BERT-style pretraining for time-series fault detection.

fault detection, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.15712

Country: Asia > China (0.16)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Energy > Energy Storage (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback