Just as asking a single person about their health will provide tailored, personalized information impossible to glean from a large poll, an individual cell's genome or transcriptome can provide much more information about their place in living systems than sequencing a whole batch of cells. But until recent years, the technology didn't exist to get that high resolution genomic data--and until today, there wasn't a reliable way to ensure the high quality and usefulness of that data. Researchers from the University of North Carolina at Charlotte, led by Dr. Weijun Luo and Dr. Cory Brouwer, have developed an artificial intelligence algorithm to "clean" noisy single-cell RNA sequencing (scRNA-Seq) data. The study, "A Universal Deep Neural Network for In-Depth Cleaning of Single-Cell RNA-Seq Data," was published in Nature Communications on April 7, 2022. From identifying the specific genes associated with sickle cell anemia and breast cancer to creating the mRNA vaccines in the ongoing COVID-19 pandemic, scientists have been searching genomes to unlock the secrets of life since the Human Genome Project of the 1990s.
The use of artificial intelligence (AI) and deep learning (DL) in the medical and healthcare field has been increasing at an astonishing rate. While the Health Insurance Portability and Accountability Act (HIPAA) is important for the protection of personal health information, it presented as the biggest barrier for gathering large data sets required for deep learning. Several strategies have been successfully implemented to gather lots of data for training medical AI systems without risking patient privacy. AI continues to have a significant impact on medical imaging and deep learning models are constantly being developed to look for anomalies such as bone fractures or possible cancer. The introduction of breast cancer screening has helped to reduce cancer mortality rates in women as well as provide a consistent source of image data.
With the advent of advancements in deep learning approaches, such as deep convolution neural network, residual neural network, adversarial network; U-Net architectures are most widely utilized in biomedical image segmentation to address the automation in identification and detection of the target regions or sub-regions. In recent studies, U-Net based approaches have illustrated state-of-the-art performance in different applications for the development of computer-aided diagnosis systems for early diagnosis and treatment of diseases such as brain tumor, lung cancer, alzheimer, breast cancer, etc., using various modalities. This article contributes in presenting the success of these approaches by describing the U-Net framework, followed by the comprehensive analysis of the U-Net variants by performing (1) inter-modality, and (2) intra-modality categorization to establish better insights into the associated challenges and solutions. Besides, this article also highlights the contribution of U-Net based frameworks in the ongoing pandemic, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) also known as COVID-19. Finally, the strengths and similarities of these U-Net variants are analysed along with the challenges involved in biomedical image segmentation to uncover promising future research directions in this area.
Survival analysis has currently become a hot topic because it has been proven to be useful for understanding the relationships between patients' covariates (e.g. In this study, we study survival analysis of breast cancer patient with gene methylation data and clinical data. We propose a novel method for survival prediction by bidirectional LSTM network and ordinal Cox model. First, gene methylation expression data and clinical data are merged and filtered to keep matching. To reduce the gene expression feature dimension, weight gene co-expression network analysis (WGCNA) algorithm is used to obtain the cluster eigengenes.
Early detection of breast cancer is a powerful tool towards decreasing its socioeconomic burden. Although, artificial intelligence (AI) methods have shown remarkable results towards this goal, their "black box" nature hinders their wide adoption in clinical practice. To address the need for AI guided breast cancer diagnosis, interpretability methods can be utilized. In this study, we used AI methods, i.e., Random Forests (RF), Neural Networks (NN) and Ensembles of Neural Networks (ENN), towards this goal and explained and optimized their performance through interpretability techniques, such as the Global Surrogate (GS) method, the Individual Conditional Expectation (ICE) plots and the Shapley values (SV). The Wisconsin Diagnostic Breast Cancer (WDBC) dataset of the open UCI repository was used for the training and evaluation of the AI algorithms. The best performance for breast cancer diagnosis was achieved by the proposed ENN (96.6% accuracy and 0.96 area under the ROC curve), and its predictions were explained by ICE plots, proving that its decisions were compliant with current medical knowledge and can be further utilized to gain new insights in the pathophysiological mechanisms of breast cancer. Feature selection based on features' importance according to the GS model improved the performance of the RF (leading the accuracy from 96.49% to 97.18% and the area under the ROC curve from 0.96 to 0.97) and feature selection based on features' importance according to SV improved the performance of the NN (leading the accuracy from 94.6% to 95.53% and the area under the ROC curve from 0.94 to 0.95). Compared to other approaches on the same dataset, our proposed models demonstrated state of the art performance while being interpretable.
Computer-aided detection systems based on deep learning have shown great potential in breast cancer detection. However, the lack of domain generalization of artificial neural networks is an important obstacle to their deployment in changing clinical environments. In this work, we explore the domain generalization of deep learning methods for mass detection in digital mammography and analyze in-depth the sources of domain shift in a large-scale multi-center setting. To this end, we compare the performance of eight state-of-the-art detection methods, including Transformer-based models, trained in a single domain and tested in five unseen domains. Moreover, a single-source mass detection training pipeline is designed to improve the domain generalization without requiring images from the new domain. The results show that our workflow generalizes better than state-of-the-art transfer learning-based approaches in four out of five domains while reducing the domain shift caused by the different acquisition protocols and scanner manufacturers. Subsequently, an extensive analysis is performed to identify the covariate shifts with bigger effects on the detection performance, such as due to differences in patient age, breast density, mass size, and mass malignancy. Ultimately, this comprehensive study provides key insights and best practices for future research on domain generalization in deep learning-based breast cancer detection.
This study aims to develop a novel computer-aided diagnosis (CAD) scheme for mammographic breast mass classification using semi-supervised learning. Although supervised deep learning has achieved huge success across various medical image analysis tasks, its success relies on large amounts of high-quality annotations, which can be challenging to acquire in practice. To overcome this limitation, we propose employing a semi-supervised method, i.e., virtual adversarial training (VAT), to leverage and learn useful information underlying in unlabeled data for better classification of breast masses. Accordingly, our VAT-based models have two types of losses, namely supervised and virtual adversarial losses. The former loss acts as in supervised classification, while the latter loss aims at enhancing model robustness against virtual adversarial perturbation, thus improving model generalizability. To evaluate the performance of our VAT-based CAD scheme, we retrospectively assembled a total of 1024 breast mass images, with equal number of benign and malignant masses. A large CNN and a small CNN were used in this investigation, and both were trained with and without the adversarial loss. When the labeled ratios were 40% and 80%, VAT-based CNNs delivered the highest classification accuracy of 0.740 and 0.760, respectively. The experimental results suggest that the VAT-based CAD scheme can effectively utilize meaningful knowledge from unlabeled data to better classify mammographic breast mass images.
Supervised deep learning has shown state-of-the-art performance for medical image segmentation across different applications, including histopathology and cancer research; however, the manual annotation of such data is extremely laborious. In this work, we explore the use of superpixel approaches to compute a pre-segmentation of HER2 stained images for breast cancer diagnosis that facilitates faster manual annotation and correction in a second step. Four methods are compared: Standard Simple Linear Iterative Clustering (SLIC) as a baseline, a domain adapted SLIC, and superpixels based on feature embeddings of a pretrained ResNet-50 and a denoising autoencoder. To tackle oversegmentation, we propose to hierarchically merge superpixels, based on their content in the respective feature space. When evaluating the approaches on fully manually annotated images, we observe that the autoencoder-based superpixels achieve a 23% increase in boundary F1 score compared to the baseline SLIC superpixels. Furthermore, the boundary F1 score increases by 73% when hierarchical clustering is applied on the adapted SLIC and the autoencoder-based superpixels. These evaluations show encouraging first results for a pre-segmentation for efficient manual refinement without the need for an initial set of annotated training data.
Breast cancer is the most commonly occurring cancer in women and the second most common cancer overall. There were over 2.3 million new cases in 2020, making it a significant health problem in the present day. The key challenge in breast cancer detection is to classify tumors as malignant or benign. Malignant refers to cancer cells that can invade and kill nearby tissue and spread to other parts of your body. Unlike cancerous tumors (malignant), Benign does not spread to other parts of the body and is safe somehow.
Instead of using current deep-learning segmentation models (like the UNet and variants), we approach the segmentation problem using trained Convolutional Neural Network (CNN) classifiers, which automatically extract important features from classified targets for image classification. Those extracted features can be visualized and formed heatmaps using Gradient-weighted Class Activation Mapping (Grad-CAM). This study tested whether the heatmaps could be used to segment the classified targets. We also proposed an evaluation method for the heatmaps; that is, to re-train the CNN classifier using images filtered by heatmaps and examine its performance. We used the mean-Dice coefficient to evaluate segmentation results. Results from our experiments show that heatmaps can locate and segment partial tumor areas. But only use of the heatmaps from CNN classifiers may not be an optimal approach for segmentation. In addition, we have verified that the predictions of CNN classifiers mainly depend on tumor areas, and dark regions in Grad-CAM's heatmaps also contribute to classification.