We explore a solution for learning disease signatures from weakly, yet easily obtainable, annotated volumetric medical imaging data by analyzing 3D volumes as a sequence of 2D images. We demonstrate the performance of our solution in the detection of emphysema in lung cancer screening low-dose CT images. Our approach utilizes convolutional long short-term memory (LSTM) to "scan" sequentially through an imaging volume for the presence of disease in a portion of scanned region. This framework allowed effective learning given only volumetric images and binary disease labels, thus enabling training from a large dataset of 6,631 un-annotated image volumes from 4,486 patients. When evaluated in a testing set of 2,163 volumes from 2,163 patients, our model distinguished emphysema with area under the receiver operating characteristic curve (AUC) of .83. This approach was found to outperform 2D convolutional neural networks (CNN) implemented with various multiple-instance learning schemes (AUC=0.69-0.76) and a 3D CNN (AUC=.77).
--Weakly supervised instance labeling using only image-level labels, in lieu of expensive fine-grained pixel annotations, is crucial in several applications including medical image analysis. In contrast to conventional instance segmentation scenarios in computer vision, the problems that we consider are characterized by a small number of training images and non-local patterns that lead to the diagnosis. In this paper, we explore the use of multiple instance learning (MIL) to design an instance label generator under this weakly supervised setting. Motivated by the observation that an MIL model can handle bags of varying sizes, we propose to repurpose an MIL model originally trained for bag-level classification to produce reliable predictions for single instances, i.e., bags of size 1 . T o this end, we introduce a novel regularization strategy based on virtual adversarial training for improving MIL training, and subsequently develop a knowledge distillation technique for repurposing the trained MIL model. Using empirical studies on colon cancer and breast cancer detection from histopathological images, we show that the proposed approach produces high-quality instance-level prediction and significantly outperforms state-of-the MIL methods.
Computed tomography (CT) generates a stack of cross-sectional images covering a region of the body. The visual assessment of these images for the identification of potential abnormalities is a challenging and time consuming task due to the large amount of information that needs to be processed. In this article we propose a deep artificial neural network architecture, ReCTnet, for the fully-automated detection of pulmonary nodules in CT scans. The architecture learns to distinguish nodules and normal structures at the pixel level and generates three-dimensional probability maps highlighting areas that are likely to harbour the objects of interest. Convolutional and recurrent layers are combined to learn expressive image representations exploiting the spatial dependencies across axial slices. We demonstrate that leveraging intra-slice dependencies substantially increases the sensitivity to detect pulmonary nodules without inflating the false positive rate. On the publicly available LIDC/IDRI dataset consisting of 1,018 annotated CT scans, ReCTnet reaches a detection sensitivity of 90.5% with an average of 4.5 false positives per scan. Comparisons with a competing multi-channel convolutional neural network for multi-slice segmentation and other published methodologies using the same dataset provide evidence that ReCTnet offers significant performance gains.
While deep learning methods are increasingly being applied to tasks such as computer-aided diagnosis, these models are difficult to interpret, do not incorporate prior domain knowledge, and are often considered as a "black-box." The lack of model interpretability hinders them from being fully understood by target users such as radiologists. In this paper, we present a novel interpretable deep hierarchical semantic convolutional neural network (HSCNN) to predict whether a given pulmonary nodule observed on a computed tomography (CT) scan is malignant. Our network provides two levels of output: 1) low-level radiologist semantic features, and 2) a high-level malignancy prediction score. The low-level semantic outputs quantify the diagnostic features used by radiologists and serve to explain how the model interprets the images in an expert-driven manner. The information from these low-level tasks, along with the representations learned by the convolutional layers, are then combined and used to infer the high-level task of predicting nodule malignancy. This unified architecture is trained by optimizing a global loss function including both low- and high-level tasks, thereby learning all the parameters within a joint framework. Our experimental results using the Lung Image Database Consortium (LIDC) show that the proposed method not only produces interpretable lung cancer predictions but also achieves significantly better results compared to common 3D CNN approaches.
Pulmonary lobe segmentation is an important task for pulmonary disease related Computer Aided Diagnosis systems (CADs). Classical methods for lobe segmentation rely on successful detection of fissures and other anatomical information such as the location of blood vessels and airways. With the success of deep learning in recent years, Deep Convolutional Neural Network (DCNN) has been widely applied to analyze medical images like Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), which, however, requires a large number of ground truth annotations. In this work, we release our manually labeled 50 CT scans which are randomly chosen from the LUNA16 dataset and explore the use of deep learning on this task. We propose pre-processing CT image by cropping region that is covered by the convex hull of the lungs in order to mitigate the influence of noise from outside the lungs. Moreover, we design a hybrid loss function with dice loss to tackle extreme class imbalance issue and focal loss to force model to focus on voxels that are hard to be discriminated. To validate the robustness and performance of our proposed framework trained with a small number of training examples, we further tested our model on CT scans from an independent dataset. Experimental results show the robustness of the proposed approach, which consistently improves performance across different datasets by a maximum of $5.87\%$ as compared to a baseline model.