bacc
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (0.67)
Enhancing Diagnostic Accuracy for Urinary Tract Disease through Explainable SHAP-Guided Feature Selection and Classification
de Oliveira, Filipe Ferreira, Rocha, Matheus Becali, Krohling, Renato A.
In this paper, we propose an approach to support the diagnosis of urinary tract diseases, with a focus on bladder cancer, using SHAP (SHapley Additive exPlanations)-based feature selection to enhance the transparency and effectiveness of predictive models. Six binary classification scenarios were developed to distinguish bladder cancer from other urological and oncological conditions. The algorithms XGBoost, LightGBM, and CatBoost were employed, with hyperparameter optimization performed using Optuna and class balancing with the SMOTE technique. The selection of predictive variables was guided by importance values through SHAP-based feature selection while maintaining or even improving performance metrics such as balanced accuracy, precision, and specificity. The use of explainability techniques (SHAP) for feature selection proved to be an effective approach. The proposed methodology may contribute to the development of more transparent, reliable, and efficient clinical decision support systems, optimizing screening and early diagnosis of urinary tract diseases.
- South America > Brazil (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Health & Medicine > Therapeutic Area > Urology (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Bladder Cancer (0.61)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (0.67)
Lifespan tree of brain anatomy: diagnostic values for motor and cognitive neurodegenerative diseases
Coupé, Pierrick, Mansencal, Boris, Manjón, José V., Péran, Patrice, Meissner, Wassilios G., Tourdias, Thomas, Planche, Vincent
The differential diagnosis of neurodegenerative diseases, characterized by overlapping symptoms, may be challenging. Brain imaging coupled with artificial intelligence has been previously proposed for diagnostic support, but most of these methods have been trained to discriminate only isolated diseases from controls. Here, we develop a novel machine learning framework, named lifespan tree of brain anatomy, dedicated to the differential diagnosis between multiple diseases simultaneously. It integrates the modeling of volume changes for 124 brain structures during the lifespan with non-linear dimensionality reduction and synthetic sampling techniques to create easily interpretable representations of brain anatomy over the course of disease progression. As clinically relevant proof- of-concept applications, we constructed a cognitive lifespan tree of brain anatomy for the differential diagnosis of six causes of neurodegenerative dementia and a motor lifespan tree of brain anatomy for the differential diagnosis of four causes of parkinsonism using 37594 MRI as a training dataset. This original approach enhanced significantly the efficiency of differential diagnosis in the external validation cohort of 1754 cases, outperforming existing state-of-the art machine learning techniques. Lifespan tree holds promise as a valuable tool for differential diagnostic in relevant clinical conditions, especially for diseases still lacking effective biological markers.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Australia (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- (11 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
- Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Long-tailed Medical Diagnosis with Relation-aware Representation Learning and Iterative Classifier Calibration
Pan, Li, Zhang, Yupei, Yang, Qiushi, Li, Tan, Chen, Zhen
Recently computer-aided diagnosis has demonstrated promising performance, effectively alleviating the workload of clinicians. However, the inherent sample imbalance among different diseases leads algorithms biased to the majority categories, leading to poor performance for rare categories. Existing works formulated this challenge as a long-tailed problem and attempted to tackle it by decoupling the feature representation and classification. Yet, due to the imbalanced distribution and limited samples from tail classes, these works are prone to biased representation learning and insufficient classifier calibration. To tackle these problems, we propose a new Long-tailed Medical Diagnosis (LMD) framework for balanced medical image classification on long-tailed datasets. In the initial stage, we develop a Relation-aware Representation Learning (RRL) scheme to boost the representation ability by encouraging the encoder to capture intrinsic semantic features through different data augmentations. In the subsequent stage, we propose an Iterative Classifier Calibration (ICC) scheme to calibrate the classifier iteratively. This is achieved by generating a large number of balanced virtual features and fine-tuning the encoder using an Expectation-Maximization manner. The proposed ICC compensates for minority categories to facilitate unbiased classifier optimization while maintaining the diagnostic knowledge in majority classes. Comprehensive experiments on three public long-tailed medical datasets demonstrate that our LMD framework significantly surpasses state-of-the-art approaches. The source code can be accessed at https://github.com/peterlipan/LMD.
- Asia > China > Hong Kong (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Instructional Material > Course Syllabus & Notes (0.35)
- Research Report > Promising Solution (0.34)
UPL: Uncertainty-aware Pseudo-labeling for Imbalance Transductive Node Classification
Teimuri, Mohammad T., Dehghanian, Zahra, Aminian, Gholamali, Rabiee, Hamid R.
Graph-structured datasets often suffer from class imbalance, which complicates node classification tasks. In this work, we address this issue by first providing an upper bound on population risk for imbalanced transductive node classification. We then propose a simple and novel algorithm, Uncertainty-aware Pseudo-labeling (UPL). Our approach leverages pseudo-labels assigned to unlabeled nodes to mitigate the adverse effects of imbalance on classification accuracy. Furthermore, the UPL algorithm enhances the accuracy of pseudo-labeling by reducing training noise of pseudo-labels through a novel uncertainty-aware approach. We comprehensively evaluate the UPL algorithm across various benchmark datasets, demonstrating its superior performance compared to existing state-of-the-art methods.
- North America > United States > Wisconsin (0.05)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Portugal (0.04)
- Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
Spectral Graph Sample Weighting for Interpretable Sub-cohort Analysis in Predictive Models for Neuroimaging
Paschali, Magdalini, Jiang, Yu Hang, Siegel, Spencer, Gonzalez, Camila, Pohl, Kilian M., Chaudhari, Akshay, Zhao, Qingyu
Recent advancements in medicine have confirmed that brain disorders often comprise multiple subtypes of mechanisms, developmental trajectories, or severity levels. Such heterogeneity is often associated with demographic aspects (e.g., sex) or disease-related contributors (e.g., genetics). Thus, the predictive power of machine learning models used for symptom prediction varies across subjects based on such factors. To model this heterogeneity, one can assign each training sample a factor-dependent weight, which modulates the subject's contribution to the overall objective loss function. To this end, we propose to model the subject weights as a linear combination of the eigenbases of a spectral population graph that captures the similarity of factors across subjects. In doing so, the learned weights smoothly vary across the graph, highlighting sub-cohorts with high and low predictability. Our proposed sample weighting scheme is evaluated on two tasks. First, we predict initiation of heavy alcohol drinking in young adulthood from imaging and neuropsychological measures from the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA). Next, we detect Dementia vs. Mild Cognitive Impairment (MCI) using imaging and demographic measurements in subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Compared to existing sample weighting schemes, our sample weights improve interpretability and highlight sub-cohorts with distinct characteristics and varying model accuracy.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Santa Clara County > Stanford (0.05)
- (4 more...)
Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography
Du, Yuexi, Onofrey, John, Dvornek, Nicha C.
Contrastive Language-Image Pre-training (CLIP) shows promise in medical image analysis but requires substantial data and computational resources. Due to these restrictions, existing CLIP applications in medical imaging focus mainly on modalities like chest X-rays that have abundant image-report data available, leaving many other important modalities under-explored. Here, we propose the first adaptation of the full CLIP model to mammography, which presents significant challenges due to labeled data scarcity, high-resolution images with small regions of interest, and data imbalance. We first develop a specialized supervision framework for mammography that leverages its multi-view nature. Furthermore, we design a symmetric local alignment module to better focus on detailed features in high-resolution images. Lastly, we incorporate a parameter-efficient fine-tuning approach for large language models pre-trained with medical knowledge to address data limitations. Our multi-view and multi-scale alignment (MaMA) method outperforms state-of-the-art baselines for three different tasks on two large real-world mammography datasets, EMBED and RSNA-Mammo, with only 52% model size compared with the largest baseline.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Virginia > Fairfax County > Reston (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Beyond Size and Class Balance: Alpha as a New Dataset Quality Metric for Deep Learning
Couch, Josiah, Arnaout, Rima, Arnaout, Ramy
In deep learning, achieving high performance on image classification tasks requires diverse training sets. However, the current best practice$\unicode{x2013}$maximizing dataset size and class balance$\unicode{x2013}$does not guarantee dataset diversity. We hypothesized that, for a given model architecture, model performance can be improved by maximizing diversity more directly. To test this hypothesis, we introduce a comprehensive framework of diversity measures from ecology that generalizes familiar quantities like Shannon entropy by accounting for similarities among images. (Size and class balance emerge as special cases.) Analyzing thousands of subsets from seven medical datasets showed that the best correlates of performance were not size or class balance but $A$$\unicode{x2013}$"big alpha"$\unicode{x2013}$a set of generalized entropy measures interpreted as the effective number of image-class pairs in the dataset, after accounting for image similarities. One of these, $A_0$, explained 67% of the variance in balanced accuracy, vs. 54% for class balance and just 39% for size. The best pair of measures was size-plus-$A_1$ (79%), which outperformed size-plus-class-balance (74%). Subsets with the largest $A_0$ performed up to 16% better than those with the largest size (median improvement, 8%). We propose maximizing $A$ as a way to improve deep learning performance in medical imaging.
- North America > United States > California > San Francisco County > San Francisco (0.46)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Middle East > Israel (0.04)
- (4 more...)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.93)
Evaluating Fair Feature Selection in Machine Learning for Healthcare
Zawad, Md Rahat Shahriar, Washington, Peter
With the universal adoption of machine learning in healthcare, the potential for the automation of societal biases to further exacerbate health disparities poses a significant risk. We explore algorithmic fairness from the perspective of feature selection. Traditional feature selection methods identify features for better decision making by removing resource-intensive, correlated, or non-relevant features but overlook how these factors may differ across subgroups. To counter these issues, we evaluate a fair feature selection method that considers equal importance to all demographic groups. We jointly considered a fairness metric and an error metric within the feature selection process to ensure a balance between minimizing both bias and global classification error. We tested our approach on three publicly available healthcare datasets. On all three datasets, we observed improvements in fairness metrics coupled with a minimal degradation of balanced accuracy. Our approach addresses both distributive and procedural fairness within the fair machine learning context.
- North America > United States > Hawaii > Honolulu County > Honolulu (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Rhode Island (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.46)