AITopics | Diagnosis

Diagnosis

SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems

Neural Information Processing Systems | Jan-18-2025, 07:38:20 GMT

Gradient Boosted Decision Tree (GBDT) is a widely-used machine learning algorithm that has been shown to achieve state-of-the-art results on many standard data science problems. We are interested in its application to multioutput problems when the output is highly multidimensional. Although there are highly effective GBDT implementations, their scalability to such problems is still unsatisfactory. In this paper, we propose novel methods aiming to accelerate the training process of GBDT in the multioutput scenario. The idea behind these methods lies in the approximate computation of a scoring function used to find the best split of decision trees.
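To make the idea of an approximate split score concrete, here is a minimal sketch in plain Python. It is not the paper's method: the abstract does not say how the approximation is done, so the scoring function (squared gradient sums per output, divided by leaf size plus a regularizer `lam`), the column-sampling strategy, and the names `split_score` and `sample_dims` are all assumptions chosen for illustration.

```python
import random

def split_score(grad_rows, left_mask, dims, lam=1.0):
    """Score a candidate split using only the output dimensions in `dims`.

    grad_rows: one gradient vector per sample (one entry per output).
    left_mask: True if the sample goes to the left child.
    NOTE: this scoring rule is an assumed, simplified stand-in.
    """
    def leaf_score(rows):
        if not rows:
            return 0.0
        total = 0.0
        for d in dims:
            s = sum(r[d] for r in rows)  # gradient sum for output d
            total += s * s
        return total / (len(rows) + lam)

    left = [r for r, m in zip(grad_rows, left_mask) if m]
    right = [r for r, m in zip(grad_rows, left_mask) if not m]
    return leaf_score(left) + leaf_score(right)

def sample_dims(n_outputs, k, seed=0):
    """Pick k output dimensions uniformly at random (hypothetical sketch step)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_outputs), k))

# Toy problem: 4 samples, 4 outputs.
grads = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [-1.0, -1.0, -1.0, -1.0],
    [-1.0, -1.0, -1.0, -1.0],
]
good_split = [True, True, False, False]   # separates the two groups
bad_split = [True, False, True, False]    # mixes them

all_dims = list(range(4))
sketch_dims = sample_dims(n_outputs=4, k=2)

full_good = split_score(grads, good_split, all_dims)
full_bad = split_score(grads, bad_split, all_dims)
approx_good = split_score(grads, good_split, sketch_dims)
approx_bad = split_score(grads, bad_split, sketch_dims)
```

In this toy case the reduced score touches half as many output columns and still ranks the two candidate splits in the same order as the full score. That is the property an approximation of this kind relies on, though nothing here shows it holds in general.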

fast gradient boosted decision tree, gradient boosted decision tree, sketchboost, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.66)

Interventions, Where and How? Experimental Design for Causal Models at Scale

Neural Information Processing Systems | Jan-18-2025, 02:39:48 GMT

Causal discovery from observational and interventional data is challenging due to limited data and non-identifiability, which introduce uncertainties in estimating the underlying structural causal model (SCM). Incorporating these uncertainties and selecting optimal experiments (interventions) to perform can help to identify the true SCM faster. Existing methods in experimental design for causal discovery from limited data either rely on linear assumptions for the SCM or select only the intervention target. In this paper, we incorporate recent advances in Bayesian causal discovery into the Bayesian optimal experimental design framework, which allows for active causal discovery of nonlinear, large SCMs, while selecting both the target and the value to intervene with. We demonstrate the performance of the proposed method on synthetic graphs (Erdős-Rényi, Scale Free) for both linear and nonlinear SCMs as well as on the in-silico single-cell gene regulatory network dataset, DREAM.

causal discovery, experimental design, intervention, (2 more...)

Genre: Research Report (0.92)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.66)

In the Picture: Medical Imaging Datasets, Artifacts, and their Living Review

Jiménez-Sánchez, Amelia, Avlona, Natalia-Rozalia, de Boer, Sarah, Campello, Víctor M., Feragen, Aasa, Ferrante, Enzo, Ganz, Melanie, Gichoya, Judy Wawira, González, Camila, Groefsema, Steff, Hering, Alessa, Hulman, Adam, Joskowicz, Leo, Juodelyte, Dovile, Kandemir, Melih, Kooi, Thijs, Lérida, Jorge del Pozo, Li, Livie Yumeng, Pacheco, Andre, Rädsch, Tim, Reyes, Mauricio, Sourget, Théo, van Ginneken, Bram, Wen, David, Weng, Nina, Xu, Jack Junchi, Zając, Hubert Dariusz, Zuluaga, Maria A., Cheplygina, Veronika

arXiv.org Artificial Intelligence | Jan-18-2025

Datasets play a critical role in medical imaging research, yet issues such as label quality, shortcuts, and metadata are often overlooked. This lack of attention may harm the generalizability of algorithms and, consequently, negatively impact patient outcomes. While existing medical imaging literature reviews mostly focus on machine learning (ML) methods, with only a few focusing on datasets for specific applications, these reviews remain static: they are published once and not updated thereafter. This fails to account for emerging evidence, such as biases, shortcuts, and additional annotations that other researchers may contribute after the dataset is published. We refer to these newly discovered findings of datasets as research artifacts. To address this gap, we propose a living review that continuously tracks public datasets and their associated research artifacts across multiple medical imaging applications. Our approach includes a framework for the living review to monitor data documentation artifacts, and an SQL database to visualize the citation relationships between research artifacts and datasets. Lastly, we discuss key considerations for creating medical imaging datasets, review best practices for data annotation, discuss the significance of shortcuts and demographic diversity, and emphasize the importance of managing datasets throughout their entire lifecycle. Our demo is publicly available at http://130.226.140.142.

artificial intelligence, deep learning, machine learning, (12 more...)

arXiv:2501.10727

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Germany (0.04)
(18 more...)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.67)

Class Incremental Fault Diagnosis under Limited Fault Data via Supervised Contrastive Knowledge Distillation

Zhang, Hanrong, Yao, Yifei, Wang, Zixuan, Su, Jiayuan, Li, Mengxuan, Peng, Peng, Wang, Hongwei

arXiv.org Artificial Intelligence | Jan-16-2025

Class-incremental fault diagnosis requires a model to adapt to new fault classes while retaining previous knowledge. However, limited research exists for imbalanced and long-tailed data. Extracting discriminative features from few-shot fault data is challenging, and adding new fault classes often demands costly model retraining. To tackle these issues, we introduce a Supervised Contrastive knowledge distiLlation for class Incremental Fault Diagnosis (SCLIFD) framework proposing supervised contrastive knowledge distillation for improved representation learning capability and less forgetting, a novel prioritized exemplar selection method for sample replay to alleviate catastrophic forgetting, and the Random Forest Classifier to address the class imbalance. Extensive experimentation on simulated and real-world industrial datasets across various imbalance ratios demonstrates the superiority of SCLIFD over existing approaches.

Data-driven fault diagnosis techniques have gained significant prominence over the past two decades [1-5]. However, most of them necessitate sufficient training data to achieve reliable modeling performance [6-9]. Unfortunately, fault data is typically limited in comparison to normal data. This is because engineering equipment primarily operates under normal conditions, and the probabilities of faults vary across different working environments. Besides, fault simulation experiments are costly and inevitably deviate to some extent from real industrial environments. These possible reasons consequently contribute to class imbalance and a long-tailed distribution among different conditions [10]. The performance of the model typically suffers as it tends to prioritize the normal class, consequently neglecting fault classes or tail classes.

dataset, fault diagnosis, learning, (14 more...)

arXiv:2501.09525

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Tennessee (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Education > Educational Setting (0.93)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Quantized Training of Gradient Boosting Decision Trees

Neural Information Processing Systems | Jan-13-2025, 14:11:58 GMT

Recent years have witnessed significant success in Gradient Boosting Decision Trees (GBDT) for a wide range of machine learning applications. Generally, the consensus about GBDT training algorithms is that gradients and statistics are computed using high-precision floating-point numbers. In this paper, we investigate an important question that has been largely ignored by the previous literature: how many bits are needed to represent gradients when training GBDT? To answer it, we propose to quantize all the high-precision gradients in a very simple yet effective way in the GBDT training algorithm. Surprisingly, both our theoretical analysis and empirical studies show that the gradient precision needed to avoid hurting performance can be quite low, e.g., 2 or 3 bits.
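For readers unfamiliar with low-bit gradients, the sketch below shows one common way to map floating-point gradients to a few signed integer levels. The abstract says only that the gradients are quantized "in a very simple yet effective way", so the stochastic rounding, the max-absolute-value scaling, and the helper name `quantize_gradients` are assumptions, not the paper's procedure.

```python
import math
import random

def quantize_gradients(grads, bits=3, seed=0):
    """Map float gradients to signed low-bit integers (assumed scheme).

    Returns (quantized_ints, scale) such that q * scale approximates g.
    Stochastic rounding keeps the quantized value unbiased in expectation.
    """
    rng = random.Random(seed)
    levels = 2 ** (bits - 1) - 1           # 3 bits -> integers in [-3, 3]
    max_abs = max(abs(g) for g in grads)
    if max_abs == 0.0:
        return [0] * len(grads), 1.0
    scale = max_abs / levels
    quantized = []
    for g in grads:
        x = g / scale
        low = math.floor(x)
        # Round up with probability equal to the fractional part of x.
        q = low + (1 if rng.random() < x - low else 0)
        # Clamp to guard against floating-point overshoot at the extremes.
        q = max(-levels, min(levels, q))
        quantized.append(q)
    return quantized, scale

# Example: four gradients reduced to 3-bit signed integers.
example_grads = [0.5, -1.0, 0.25, 0.0]
q_grads, q_scale = quantize_gradients(example_grads, bits=3)
dequantized = [q * q_scale for q in q_grads]
```

With `bits=3` every gradient becomes one of seven integers, so per-split gradient sums can be accumulated in integer arithmetic and rescaled once by `q_scale`.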

decision tree, gbdt, quantized training, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.64)

A Survey of Embodied AI in Healthcare: Techniques, Applications, and Opportunities

Liu, Yihao, Cao, Xu, Chen, Tingting, Jiang, Yankai, You, Junjie, Wu, Minghua, Wang, Xiaosong, Feng, Mengling, Jin, Yaochu, Chen, Jintai

arXiv.org Artificial Intelligence | Jan-13-2025

Healthcare systems worldwide face persistent challenges in efficiency, accessibility, and personalization. Powered by modern AI technologies such as multimodal large language models and world models, Embodied AI (EmAI) represents a transformative frontier, offering enhanced autonomy and the ability to interact with the physical world to address these challenges. As an interdisciplinary and rapidly evolving research domain, "EmAI in healthcare" spans diverse fields such as algorithms, robotics, and biomedicine. This complexity underscores the importance of timely reviews and analyses to track advancements, address challenges, and foster cross-disciplinary collaboration. In this paper, we provide a comprehensive overview of the "brain" of EmAI for healthcare, wherein we introduce foundational AI algorithms for perception, actuation, planning, and memory, and focus on presenting the healthcare applications spanning clinical interventions, daily care & companionship, infrastructure support, and biomedical research. Despite its promise, the development of EmAI for healthcare is hindered by critical challenges such as safety concerns, gaps between simulation platforms and real-world applications, the absence of standardized benchmarks, and uneven progress across interdisciplinary domains. We discuss the technical barriers and explore ethical considerations, offering a forward-looking perspective on the future of EmAI in healthcare. A hierarchical framework of intelligent levels for EmAI systems is also introduced to guide further development. By providing systematic insights, this work aims to inspire innovation and practical applications, paving the way for a new era of intelligent, patient-centered healthcare.

large language model, machine learning, real time system, (23 more...)

arXiv:2501.07468

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Illinois (0.27)
North America > United States > California (0.27)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
(21 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Data Science > Data Mining (1.00)
(17 more...)

Knowledge Distillation and Enhanced Subdomain Adaptation Using Graph Convolutional Network for Resource-Constrained Bearing Fault Diagnosis

Kavianpour, Mohammadreza, Kavianpour, Parisa, Ramezani, Amin, Beheshti, Mohammad TH

arXiv.org Artificial Intelligence | Jan-13-2025

Bearing fault diagnosis under varying working conditions faces challenges, including a lack of labeled data, distribution discrepancies, and resource constraints. To address these issues, we propose a progressive knowledge distillation framework that transfers knowledge from a complex teacher model, utilizing a Graph Convolutional Network (GCN) with Autoregressive moving average (ARMA) filters, to a compact and efficient student model. To mitigate distribution discrepancies and labeling uncertainty, we introduce Enhanced Local Maximum Mean Squared Discrepancy (ELMMSD), which leverages mean and variance statistics in the Reproducing Kernel Hilbert Space (RKHS) and incorporates a priori probability distributions between labels. This approach increases the distance between clustering centers, bridges subdomain gaps, and enhances subdomain alignment reliability. Experimental results on benchmark datasets (CWRU and JNU) demonstrate that the proposed method achieves superior diagnostic accuracy while significantly reducing computational costs. Comprehensive ablation studies validate the effectiveness of each component, highlighting the robustness and adaptability of the approach across diverse working conditions.

artificial intelligence, expert system, machine learning, (17 more...)

arXiv:2501.07173

Genre: Research Report (1.00)

Industry:

Education (0.53)
Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Government developing AI-based system to detect vacant houses

The Japan Times | Jan-12-2025, 04:50:00 GMT

Japan's infrastructure ministry is developing a system that combines administrative information held by local governments to detect vacant houses using artificial intelligence. Based on information such as water usage, basic resident registers and real estate registries, the system uses AI to calculate the probability that a building is unoccupied. For example, if the building is an old wooden house with very low water use and only one elderly resident registered, the system displays a high probability that the house is unoccupied. As some vacant houses are difficult to identify from the exterior alone, the aim of the new system is to detect them at an early stage and make them available for sale or rent, or demolish them before they collapse.

ai-based system, detect vacant house, government, (1 more...)

Country: Asia > Japan (0.68)

Industry: Banking & Finance > Real Estate (0.32)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.40)

MedGrad E-CLIP: Enhancing Trust and Transparency in AI-Driven Skin Lesion Diagnosis

Kamal, Sadia, Oates, Tim

arXiv.org Artificial Intelligence | Jan-12-2025

As deep learning models gain traction in medical data, ensuring transparent and trustworthy decision-making is essential. In skin cancer diagnosis, while advancements in lesion detection and classification have improved accuracy, the black-box nature of these methods poses challenges in understanding their decision processes, leading to trust issues among physicians. This study leverages the CLIP (Contrastive Language-Image Pretraining) model, trained on different skin lesion datasets, to capture meaningful relationships between visual features and diagnostic criteria terms. To further enhance transparency, we propose a method called MedGrad E-CLIP, which builds on gradient-based E-CLIP by incorporating a weighted entropy mechanism designed for complex medical imaging like skin lesions. This approach highlights critical image regions linked to specific diagnostic descriptions. The developed integrated pipeline not only classifies skin lesions by matching corresponding descriptions but also adds an essential layer of explainability developed especially for medical data. By visually explaining how different features in an image relate to diagnostic criteria, this approach demonstrates the potential of advanced vision-language models in medical image analysis, ultimately improving transparency, robustness, and trust in AI-driven diagnostic systems.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:2501.06887

Country:

North America > United States > Maryland (0.29)
North America > Canada (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Dermatology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.38)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.88)

The Role of Machine Learning in Congenital Heart Disease Diagnosis: Datasets, Algorithms, and Insights

Khan, Khalil, Ullah, Farhan, Syed, Ikram, Ullah, Irfan

arXiv.org Artificial Intelligence | Jan-8-2025

Congenital heart disease is among the most common fetal abnormalities and birth defects. Despite identifying numerous risk factors influencing its onset, a comprehensive understanding of its genesis and management across diverse populations remains limited. Recent advancements in machine learning have demonstrated the potential for leveraging patient data to enable early congenital heart disease detection. Over the past seven years, researchers have proposed various data-driven and algorithmic solutions to address this challenge. This paper presents a systematic review of congenital heart disease recognition using machine learning, conducting a meta-analysis of 432 references from leading journals published between 2018 and 2024. A detailed investigation of 74 scholarly works highlights key factors, including databases, algorithms, applications, and solutions. Additionally, the survey outlines reported datasets used by machine learning experts for congenital heart disease recognition. Using a systematic literature review methodology, this study identifies critical challenges and opportunities in applying machine learning to congenital heart disease.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv:2501.04493

Country:

Asia > China > Shanxi Province (0.14)
Asia > Kazakhstan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.67)
