Diagnosis
Co-training partial domain adaptation networks for industrial Fault Diagnosis
The partial domain adaptation (PDA) challenge is a prevalent issue in industrial fault diagnosis. Drawing inspiration from traditional classification settings where such a partial challenge is not a concern, we propose a novel PDA framework called Interactive Residual Domain Adaptation Networks (IRDAN), which introduces a domain-wise model for each domain to provide a new perspective on the PDA challenge. Each domain-wise model is equipped with a residual domain adaptation (RDA) block to mitigate the PDA problem. Additionally, we introduce a confident information flow via an interactive learning strategy, training the modules of IRDAN sequentially to avoid cross-interference. We also establish a reliable stopping criterion for selecting the best-performing model, ensuring practical usability in real-world applications. Experiments demonstrate the superior performance of the proposed IRDAN.
Use Digital Twins to Support Fault Diagnosis From System-level Condition-monitoring Data
Court, Killian Mc, Court, Xavier Mc, Du, Shijia, Zeng, Zhiguo
Deep learning models have created great opportunities for data-driven fault diagnosis, but they require large amounts of labeled failure data for training. In this paper, we propose using a digital twin to support the development of data-driven fault diagnosis models, reducing the amount of failure data needed in the training process. The developed fault diagnosis models are also able to diagnose component-level failures based on system-level condition-monitoring data. The proposed framework is evaluated on a real-world robot system. The results show that the deep learning model trained with a digital twin is able to diagnose the locations and modes of 9 faults/failures from 4 different motors. However, the performance of the model trained with a digital twin can still be improved, especially when the digital twin model has some discrepancy with the real system.
Identifying General Mechanism Shifts in Linear Causal Representations
Chen, Tianyu, Bello, Kevin, Locatello, Francesco, Aragam, Bryon, Ravikumar, Pradeep
We consider the linear causal representation learning setting where we observe a linear mixing of $d$ unknown latent factors, which follow a linear structural causal model. Recent work has shown that it is possible to recover the latent factors as well as the underlying structural causal model over them, up to permutation and scaling, provided that we have at least $d$ environments, each of which corresponds to perfect interventions on a single latent node (factor). After this powerful result, a key open problem faced by the community has been to relax these conditions: allow for coarser than perfect single-node interventions, and allow for fewer than $d$ of them, since the number of latent factors $d$ could be very large. In this work, we consider precisely such a setting, where we allow a smaller than $d$ number of environments, and also allow for very coarse interventions that can very coarsely \textit{change the entire causal graph over the latent factors}. On the flip side, we relax what we wish to extract to simply the \textit{list of nodes that have shifted between one or more environments}. We provide a surprising identifiability result that it is indeed possible, under some very mild standard assumptions, to identify the set of shifted nodes. Our identifiability proof moreover is a constructive one: we explicitly provide necessary and sufficient conditions for a node to be a shifted node, and show that we can check these conditions given observed data. Our algorithm lends itself very naturally to the sample setting where instead of just interventional distributions, we are provided datasets of samples from each of these distributions. We corroborate our results on both synthetic experiments as well as an interesting psychometric dataset. The code can be found at https://github.com/TianyuCodings/iLCS.
AI-Driven Approaches for Glaucoma Detection -- A Comprehensive Review
Hagiwara, Yuki, Ciora, Octavia-Andreea, Monnet, Maureen, Lancho, Gino, Lorenz, Jeanette Miriam
The diagnosis of glaucoma plays a critical role in the management and treatment of this vision-threatening disease. Glaucoma is a group of eye diseases that cause blindness by damaging the optic nerve at the back of the eye. Often called the "silent thief of sight", it exhibits no symptoms during the early stages. Therefore, early detection is crucial to prevent vision loss. With the rise of Artificial Intelligence (AI), particularly Deep Learning (DL) techniques, Computer-Aided Diagnosis (CADx) systems have emerged as promising tools to assist clinicians in accurately diagnosing glaucoma early. This paper aims to provide a comprehensive overview of AI techniques utilized in CADx systems for glaucoma diagnosis. Through a detailed analysis of current literature, we identify key gaps and challenges in these systems, emphasizing the need for improved safety, reliability, interpretability, and explainability. By identifying research gaps, we aim to advance the field of CADx systems, especially for the early diagnosis of glaucoma, in order to prevent potential loss of vision.
A General-Purpose Multimodal Foundation Model for Dermatology
Yan, Siyuan, Yu, Zhen, Primiero, Clare, Vico-Alonso, Cristina, Wang, Zhonghua, Yang, Litao, Tschandl, Philipp, Hu, Ming, Tan, Gin, Tang, Vincent, Ng, Aik Beng, Powell, David, Bonnington, Paul, See, Simon, Janda, Monika, Mar, Victoria, Kittler, Harald, Soyer, H. Peter, Ge, Zongyuan
Diagnosing and treating skin diseases require advanced visual skills across multiple domains and the ability to synthesize information from various imaging modalities. Current deep learning models, while effective at specific tasks such as diagnosing skin cancer from dermoscopic images, fall short in addressing the complex, multimodal demands of clinical practice. Here, we introduce PanDerm, a multimodal dermatology foundation model pretrained through self-supervised learning on a dataset of over 2 million real-world images of skin diseases, sourced from 11 clinical institutions across 4 imaging modalities. We evaluated PanDerm on 28 diverse datasets covering a range of clinical tasks, including skin cancer screening, phenotype assessment and risk stratification, diagnosis of neoplastic and inflammatory skin diseases, skin lesion segmentation, change monitoring, and metastasis prediction and prognosis. PanDerm achieved state-of-the-art performance across all evaluated tasks, often outperforming existing models even when using only 5-10% of labeled data. PanDerm's clinical utility was demonstrated through reader studies in real-world clinical settings across multiple imaging modalities. It outperformed clinicians by 10.2% in early-stage melanoma detection accuracy and enhanced clinicians' multiclass skin cancer diagnostic accuracy by 11% in a collaborative human-AI setting. Additionally, PanDerm demonstrated robust performance across diverse demographic factors, including different body locations, age groups, genders, and skin tones. The strong results in benchmark evaluations and real-world clinical scenarios suggest that PanDerm could enhance the management of skin diseases and serve as a model for developing multimodal foundation models in other medical specialties, potentially accelerating the integration of AI support in healthcare.
SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation
Wang, Junda, Ting, Yujan, Chen, Eric Z., Tran, Hieu, Yu, Hong, Huang, Weijing, Chen, Terrence
Multimodal large language models (MLLMs) have made significant strides, yet they face challenges in the medical domain due to limited specialized knowledge. While recent medical MLLMs demonstrate strong performance in lab settings, they often struggle in real-world applications, highlighting a substantial gap between research and practice. In this paper, we seek to address this gap at various stages of the end-to-end learning pipeline, including data collection, model fine-tuning, and evaluation. At the data collection stage, we introduce SemiHVision, a dataset that combines human annotations with automated augmentation techniques to improve both medical knowledge representation and diagnostic reasoning. For model fine-tuning, we trained PMC-Cambrian-8B-AN over 2400 H100 GPU hours, resulting in performance that surpasses public medical models like HuatuoGPT-Vision-34B (79.0% vs. 66.7%) and private general models like Claude3-Opus (55.7%) on traditional benchmarks such as SLAKE and VQA-RAD. In the evaluation phase, we observed that traditional benchmarks cannot accurately reflect realistic clinical task capabilities. To overcome this limitation and provide more targeted guidance for model evaluation, we introduce the JAMA Clinical Challenge, a novel benchmark specifically designed to evaluate diagnostic reasoning. On this benchmark, PMC-Cambrian-AN achieves state-of-the-art performance with a GPT-4 score of 1.29, significantly outperforming HuatuoGPT-Vision-34B (1.13) and Claude3-Opus (1.17), demonstrating its superior diagnostic reasoning abilities.
Enhancing AI Accessibility in Veterinary Medicine: Linking Classifiers and Electronic Health Records
Kong, Chun Yin, Vasquez, Picasso, Farhoodimoghadam, Makan, Brandt, Chris, Brown, Titus C., Reagan, Krystle L., Zwingenberger, Allison, Keller, Stefan M.
Background: In the rapidly evolving landscape of veterinary healthcare, integrating machine learning (ML) clinical decision-making tools with electronic health records (EHRs) promises to improve diagnostic accuracy and patient care. However, the seamless integration of ML classifiers into existing EHRs in veterinary medicine is frequently hindered by the rigidity of EHR systems or the limited availability of IT resources. Results: To address this shortcoming, we present Anna, a freely-available software solution that provides ML classifier results for EHR laboratory data in real-time. Anna is a standalone platform developed in Python, designed to host ML classifiers, retrieve patient-specific data from an EHR system, generate classifier results and return these results to the EHR for display. Anna merges results from different diagnostic tests according to user-defined temporal criteria and determines whether the data are sufficient for a given classifier. Because Anna is a stand-alone platform, it does not require substantial modifications to the existing EHR, allowing for easy integration into existing computing infrastructure. To demonstrate Anna's versatility, we implemented three previously published ML classifiers to predict a diagnosis of hypoadrenocorticism, leptospirosis, or a portosystemic shunt in dogs. Conclusion: Anna is an open-source tool designed to improve the accessibility of ML classifiers for the veterinary community. Its flexible architecture supports the integration of classifiers developed in various programming languages and with diverse environment requirements.
Preliminary Evaluation of an Ultrasound-Guided Robotic System for Autonomous Percutaneous Intervention
Mohan, Pratima, Agrawal, Aayush, Patel, Niravkumar A.
Cancer cases have been rising globally, resulting in nearly 10 million deaths in 2023. Biopsy, crucial for diagnosis, is often performed under ultrasound (US) guidance, demanding precise hand coordination and cognitive decision-making. Robot-assisted interventions have shown improved accuracy in lesion targeting by addressing challenges such as noisy 2D images and maintaining consistent probe-to-surface contact. Recent research has focused on fully autonomous robotic US systems to enable standardized diagnostic procedures and reproducible US-guided therapy. This study presents a fully autonomous system for US-guided needle placement capable of performing an end-to-end clinical workflow. The system autonomously: 1) identifies the liver region on the patient's abdomen surface, 2) plans and executes the US scanning path using impedance control, 3) localizes lesions from the US images in real-time, and 4) targets the identified lesions, all without human intervention. This study evaluates both position-controlled and impedance-controlled systems. Validation on agar phantoms demonstrated a targeting error of 5.74 ± 2.70 mm, highlighting its potential for accurately targeting tumors larger than 5 mm. These results show the potential of a fully autonomous system for US-guided biopsies.
Decision trees as partitioning machines to characterize their generalization properties
Decision trees are popular machine learning models that are simple to build and easy to interpret. Even though algorithms to learn decision trees date back almost 50 years, key properties affecting their generalization error are still weakly bounded. Hence, we revisit binary decision trees on real-valued features from the perspective of partitions of the data. We introduce the notion of partitioning function, and we relate it to the growth function and to the VC dimension. Using this new concept, we are able to find the exact VC dimension of decision stumps, which is given by the largest integer $d$ such that $2\ell \ge \binom{d}{\lfloor d/2 \rfloor}$, where $\ell$ is the number of real-valued features.
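The closed-form condition quoted in the abstract can be evaluated directly. The following is a minimal sketch, not code from the paper: the function name `stump_vc_dimension` is hypothetical, and the loop relies on the fact that the central binomial coefficient is non-decreasing in d, so the first d that violates the inequality ends the search.

```python
from math import comb


def stump_vc_dimension(num_features: int) -> int:
    """Largest integer d with 2 * num_features >= C(d, floor(d / 2)).

    `num_features` plays the role of ell, the number of real-valued features.
    The function name is illustrative and does not come from the paper.
    """
    if num_features < 1:
        raise ValueError("num_features must be a positive integer")
    d = 1  # d = 1 always satisfies the condition, since C(1, 0) = 1 <= 2 * ell
    # C(d, floor(d / 2)) never decreases as d grows, so stop at the first failure.
    while comb(d + 1, (d + 1) // 2) <= 2 * num_features:
        d += 1
    return d


if __name__ == "__main__":
    for ell in (1, 2, 3, 10):
        print(ell, stump_vc_dimension(ell))
```

For a single feature the condition gives d = 2, and the value grows slowly (roughly logarithmically) with the number of features, because the central binomial coefficient grows exponentially in d.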
Exploring the Whole Rashomon Set of Sparse Decision Trees
In any given machine learning problem, there may be many models that could explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that might have desirable properties beyond what could be expressed within a loss function. The Rashomon set is the set of all these almost-optimal models. Rashomon sets can be extremely complicated, particularly for highly nonlinear function classes that allow complex interaction terms, such as decision trees. We provide the first technique for completely enumerating the Rashomon set for sparse decision trees; in fact, our work provides the first complete enumeration of any Rashomon set for a non-trivial problem with a highly nonlinear discrete function class.
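To make the notion of a Rashomon set concrete, here is a toy sketch that enumerates it by brute force for single-split decision stumps. It is not the paper's technique, which targets sparse decision trees: the hypothesis class, the plain error-rate objective, the tolerance `epsilon`, the data, and all function names are assumptions chosen only for illustration.

```python
from typing import List, Tuple

# A stump is (feature index, threshold, label predicted when x[feature] <= threshold).
Stump = Tuple[int, float, int]


def stump_predict(stump: Stump, x: List[float]) -> int:
    feature, threshold, left_label = stump
    return left_label if x[feature] <= threshold else 1 - left_label


def error_rate(stump: Stump, X: List[List[float]], y: List[int]) -> float:
    mistakes = sum(stump_predict(stump, x) != label for x, label in zip(X, y))
    return mistakes / len(y)


def candidate_stumps(X: List[List[float]]) -> List[Stump]:
    """All stumps whose threshold is a midpoint between consecutive feature values."""
    stumps = []
    for feature in range(len(X[0])):
        values = sorted({x[feature] for x in X})
        for low, high in zip(values, values[1:]):
            for left_label in (0, 1):
                stumps.append((feature, (low + high) / 2, left_label))
    return stumps


def rashomon_set(X, y, epsilon: float) -> List[Tuple[float, Stump]]:
    """Every candidate stump whose error is within epsilon of the best error."""
    scored = [(error_rate(s, X, y), s) for s in candidate_stumps(X)]
    best = min(err for err, _ in scored)
    return [(err, s) for err, s in scored if err <= best + epsilon]


# Hypothetical toy data: two real-valued features, binary labels.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [0, 0, 1, 1]

if __name__ == "__main__":
    for err, stump in rashomon_set(X, y, epsilon=0.25):
        print(err, stump)
```

Even on four points, several distinct stumps fall within the tolerance of the best one. Exhaustive scoring like this is only feasible for tiny hypothesis classes, which is why complete enumeration for decision trees is a hard problem.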