Goto

Collaborating Authors

 Diagnosis


Multiple Instance Learning for Glioma Diagnosis using Hematoxylin and Eosin Whole Slide Images: An Indian Cohort Study

arXiv.org Artificial Intelligence

The effective management of brain tumors relies on precise typing, subtyping, and grading. This study advances patient care with findings from rigorous multiple instance learning experimentations across various feature extractors and aggregators in brain tumor histopathology. It establishes new performance benchmarks in glioma subtype classification across multiple datasets, including a novel dataset focused on the Indian demographic (IPD- Brain), providing a valuable resource for existing research. Using a ResNet-50, pretrained on histopathology datasets for feature extraction, combined with the Double-Tier Feature Distillation (DTFD) feature aggregator, our approach achieves state-of-the-art AUCs of 88.08 on IPD-Brain and 95.81 on the TCGA-Brain dataset, respectively, for three-way glioma subtype classification. Moreover, it establishes new benchmarks in grading and detecting IHC molecular biomarkers (IDH1R132H, TP53, ATRX, Ki-67) through H&E stained whole slide images for the IPD-Brain dataset. The work also highlights a significant correlation between the model decision-making processes and the diagnostic reasoning of pathologists, underscoring its capability to mimic professional diagnostic procedures.


AFBT GAN: enhanced explainability and diagnostic performance for cognitive decline by counterfactual generative adversarial network

arXiv.org Artificial Intelligence

Existing explanation results of functional connectivity (FC) are normally generated by using classification result labels and correlation analysis methods such as Pearson's correlation or gradient backward. However, the diagnostic model is still trained on the black box model and might lack the attention of FCs in important regions during the training. To enhance the explainability and improve diagnostic performance, providing prior knowledge on neurodegeneration-related regions when healthy subjects (HC) develop into subject cognitive decline (SCD) and mild cognitive impairment (MCI) for the diagnostic model is a key step. To better determine the neurodegeneration-related regions, we employ counterfactual reasoning to generate the target label FC matrices derived from source label FC and then subtract source label FC with target label FC. The counterfactual reasoning architecture is constructed by adaptive forward and backward transformer generative adversarial network (AFBT GAN), which is specifically designed by network property in FC and inverse patch embedding operation in the transformer. The specific design can make the model focus more on the current network correlation and employ the global insight of the transformer to reconstruct FC, which both help the generation of high-quality target label FC. The validation experiments are conducted on both clinical and public datasets, the generated attention map are both vital correlated to cognitive function and the diagnostic performance is also significant. The code is available at https://github.com/SXR3015/AFBT-GAN.


Causal Bandits with General Causal Models and Interventions

arXiv.org Machine Learning

This paper considers causal bandits (CBs) for the sequential design of interventions in a causal system. The objective is to optimize a reward function via minimizing a measure of cumulative regret with respect to the best sequence of interventions in hindsight. The paper advances the results on CBs in three directions. First, the structural causal models (SCMs) are assumed to be unknown and drawn arbitrarily from a general class $\mathcal{F}$ of Lipschitz-continuous functions. Existing results are often focused on (generalized) linear SCMs. Second, the interventions are assumed to be generalized soft with any desired level of granularity, resulting in an infinite number of possible interventions. The existing literature, in contrast, generally adopts atomic and hard interventions. Third, we provide general upper and lower bounds on regret. The upper bounds subsume (and improve) known bounds for special cases. The lower bounds are generally hitherto unknown. These bounds are characterized as functions of the (i) graph parameters, (ii) eluder dimension of the space of SCMs, denoted by $\operatorname{dim}(\mathcal{F})$, and (iii) the covering number of the function space, denoted by ${\rm cn}(\mathcal{F})$. Specifically, the cumulative achievable regret over horizon $T$ is $\mathcal{O}(K d^{L-1}\sqrt{T\operatorname{dim}(\mathcal{F}) \log({\rm cn}(\mathcal{F}))})$, where $K$ is related to the Lipschitz constants, $d$ is the graph's maximum in-degree, and $L$ is the length of the longest causal path. The upper bound is further refined for special classes of SCMs (neural network, polynomial, and linear), and their corresponding lower bounds are provided.


StreaMulT: Streaming Multimodal Transformer for Heterogeneous and Arbitrary Long Sequential Data

arXiv.org Artificial Intelligence

The increasing complexity of Industry 4.0 systems brings new challenges regarding predictive maintenance tasks such as fault detection and diagnosis. A corresponding and realistic setting includes multi-source data streams from different modalities, such as sensors measurements time series, machine images, textual maintenance reports, etc. These heterogeneous multimodal streams also differ in their acquisition frequency, may embed temporally unaligned information and can be arbitrarily long, depending on the considered system and task. Whereas multimodal fusion has been largely studied in a static setting, to the best of our knowledge, there exists no previous work considering arbitrarily long multimodal streams alongside with related tasks such as prediction across time. Thus, in this paper, we first formalize this paradigm of heterogeneous multimodal learning in a streaming setting as a new one. To tackle this challenge, we propose StreaMulT, a Streaming Multimodal Transformer relying on cross-modal attention and on a memory bank to process arbitrarily long input sequences at training time and run in a streaming way at inference. StreaMulT improves the state-of-the-art metrics on CMU-MOSEI dataset for Multimodal Sentiment Analysis task, while being able to deal with much longer inputs than other multimodal models. The conducted experiments eventually highlight the importance of the textual embedding layer, questioning recent improvements in Multimodal Sentiment Analysis benchmarks.


Scalable and reliable deep transfer learning for intelligent fault detection via multi-scale neural processes embedded with knowledge

arXiv.org Artificial Intelligence

Deep transfer learning (DTL) is a fundamental method in the field of Intelligent Fault Detection (IFD). It aims to mitigate the degradation of method performance that arises from the discrepancies in data distribution between training set (source domain) and testing set (target domain). Considering the fact that fault data collection is challenging and certain faults are scarce, DTL-based methods face the limitation of available observable data, which reduces the detection performance of the methods in the target domain. Furthermore, DTL-based methods lack comprehensive uncertainty analysis that is essential for building reliable IFD systems. To address the aforementioned problems, this paper proposes a novel DTL-based method known as Neural Processes-based deep transfer learning with graph convolution network (GTNP). Feature-based transfer strategy of GTNP bridges the data distribution discrepancies of source domain and target domain in high-dimensional space. Both the joint modeling based on global and local latent variables and sparse sampling strategy reduce the demand of observable data in the target domain. The multi-scale uncertainty analysis is obtained by using the distribution characteristics of global and local latent variables. Global analysis of uncertainty enables GTNP to provide quantitative values that reflect the complexity of methods and the difficulty of tasks. Local analysis of uncertainty allows GTNP to model uncertainty (confidence of the fault detection result) at each sample affected by noise and bias. The validation of the proposed method is conducted across 3 IFD tasks, consistently showing the superior detection performance of GTNP compared to the other DTL-based methods.


Improving Model's Interpretability and Reliability using Biomarkers

arXiv.org Artificial Intelligence

Accurate and interpretable diagnostic models are crucial in the safety-critical field of medicine. We investigate the interpretability of our proposed biomarker-based lung ultrasound diagnostic pipeline to enhance clinicians' diagnostic capabilities. The objective of this study is to assess whether explanations from a decision tree classifier, utilizing biomarkers, can improve users' ability to identify inaccurate model predictions compared to conventional saliency maps. Our findings demonstrate that decision tree explanations, based on clinically established biomarkers, can assist clinicians in detecting false positives, thus improving the reliability of diagnostic models in medicine.


Intelligent Diagnosis of Alzheimer's Disease Based on Machine Learning

arXiv.org Artificial Intelligence

This study is based on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and aims to explore early detection and disease progression in Alzheimer's disease (AD). We employ innovative data preprocessing strategies, including the use of the random forest algorithm to fill missing data and the handling of outliers and invalid data, thereby fully mining and utilizing these limited data resources. Through Spearman correlation coefficient analysis, we identify some features strongly correlated with AD diagnosis. We build and test three machine learning models using these features: random forest, XGBoost, and support vector machine (SVM). Among them, the XGBoost model performs the best in terms of diagnostic performance, achieving an accuracy of 91%. Overall, this study successfully overcomes the challenge of missing data and provides valuable insights into early detection of Alzheimer's disease, demonstrating its unique research value and practical significance.


An Algorithmic Framework for Constructing Multiple Decision Trees by Evaluating Their Combination Performance Throughout the Construction Process

arXiv.org Artificial Intelligence

Predictions using a combination of decision trees are known to be effective in machine learning. Typical ideas for constructing a combination of decision trees for prediction are bagging and boosting. Bagging independently constructs decision trees without evaluating their combination performance and averages them afterward. Boosting constructs decision trees sequentially, only evaluating a combination performance of a new decision tree and the fixed past decision trees at each step. Therefore, neither method directly constructs nor evaluates a combination of decision trees for the final prediction. When the final prediction is based on a combination of decision trees, it is natural to evaluate the appropriateness of the combination when constructing them. In this study, we propose a new algorithmic framework that constructs decision trees simultaneously and evaluates their combination performance throughout the construction process. Our framework repeats two procedures. In the first procedure, we construct new candidates of combinations of decision trees to find a proper combination of decision trees. In the second procedure, we evaluate each combination performance of decision trees under some criteria and select a better combination. To confirm the performance of the proposed framework, we perform experiments on synthetic and benchmark data.


Boosting-Based Sequential Meta-Tree Ensemble Construction for Improved Decision Trees

arXiv.org Artificial Intelligence

A decision tree is one of the most popular approaches in machine learning fields. However, it suffers from the problem of overfitting caused by overly deepened trees. Then, a meta-tree is recently proposed. It solves the problem of overfitting caused by overly deepened trees. Moreover, the meta-tree guarantees statistical optimality based on Bayes decision theory. Therefore, the meta-tree is expected to perform better than the decision tree. In contrast to a single decision tree, it is known that ensembles of decision trees, which are typically constructed boosting algorithms, are more effective in improving predictive performance. Thus, it is expected that ensembles of meta-trees are more effective in improving predictive performance than a single meta-tree, and there are no previous studies that construct multiple meta-trees in boosting. Therefore, in this study, we propose a method to construct multiple meta-trees using a boosting approach. Through experiments with synthetic and benchmark datasets, we conduct a performance comparison between the proposed methods and the conventional methods using ensembles of decision trees. Furthermore, while ensembles of decision trees can cause overfitting as well as a single decision tree, experiments confirmed that ensembles of meta-trees can prevent overfitting due to the tree depth.


Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models

arXiv.org Artificial Intelligence

One promising methods have proven successful across a range strategy in this context is to use data from one modality, e.g., of domains, partly due to their ability to generate text data, as a supervision signal in the interpretation of another, meaningful shared representations of complex e.g., image data (Mori et al., 1999; Wang et al., 2009; phenomena. To enhance the depth of analysis Ramanathan et al., 2013; He & Peng, 2017; Radford et al., and understanding of these acquired representations, 2021). The primary approach for achieving this is known we introduce a unified causal model specifically as multimodal contrastive representation learning, which designed for multimodal data. By examining focuses on optimizing a symmetric contrastive loss (Zhang this model, we show that multimodal contrastive et al., 2022; Radford et al., 2021), e.g., a symmetric adaptation representation learning excels at identifying latent of the standard contrastive loss (Wu et al., 2018; Tian coupled variables within the proposed unified et al., 2020; He et al., 2020; Chen et al., 2020). The learned model, up to linear or permutation transformations representations, guided by the symmetric contrastive loss, resulting from different assumptions. Our have been applied in a variety of applications, including findings illuminate the potential of pre-trained zero/few-shot learning (Radford et al., 2021; Zhou et al., multimodal models, e.g., CLIP, in learning disentangled 2022a), domain generalization (Zhou et al., 2022a;b), and representations through a surprisingly robustness to adversarial examples (Ban & Dong, 2022).