Plotting

 Tang, Hao


Automatic Pulmonary Lobe Segmentation Using Deep Learning

arXiv.org Artificial Intelligence

Pulmonary lobe segmentation is an important task for pulmonary disease related Computer Aided Diagnosis systems (CADs). Classical methods for lobe segmentation rely on successful detection of fissures and other anatomical information such as the location of blood vessels and airways. With the success of deep learning in recent years, Deep Convolutional Neural Network (DCNN) has been widely applied to analyze medical images like Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), which, however, requires a large number of ground truth annotations. In this work, we release our manually labeled 50 CT scans which are randomly chosen from the LUNA16 dataset and explore the use of deep learning on this task. We propose pre-processing CT image by cropping region that is covered by the convex hull of the lungs in order to mitigate the influence of noise from outside the lungs. Moreover, we design a hybrid loss function with dice loss to tackle extreme class imbalance issue and focal loss to force model to focus on voxels that are hard to be discriminated. To validate the robustness and performance of our proposed framework trained with a small number of training examples, we further tested our model on CT scans from an independent dataset. Experimental results show the robustness of the proposed approach, which consistently improves performance across different datasets by a maximum of $5.87\%$ as compared to a baseline model.


Automated pulmonary nodule detection using 3D deep convolutional neural networks

arXiv.org Artificial Intelligence

Early detection of pulmonary nodules in computed tomography (CT) images is essential for successful outcomes among lung cancer patients. Much attention has been given to deep convolutional neural network (DCNN)-based approaches to this task, but models have relied at least partly on 2D or 2.5D components for inherently 3D data. In this paper, we introduce a novel DCNN approach, consisting of two stages, that is fully three-dimensional end-to-end and utilizes the state-of-the-art in object detection. First, nodule candidates are identified with a U-Net-inspired 3D Faster R-CNN trained using online hard negative mining. Second, false positive reduction is performed by 3D DCNN classifiers trained on difficult examples produced during candidate screening. Finally, we introduce a method to ensemble models from both stages via consensus to give the final predictions. By using this framework, we ranked first of 2887 teams in Season One of Alibaba's 2017 TianChi AI Competition for Healthcare.


An End-to-end Framework For Integrated Pulmonary Nodule Detection and False Positive Reduction

arXiv.org Artificial Intelligence

Pulmonary nodule detection using low-dose Computed Tomography (CT) is often the first step in lung disease screening and diagnosis. Recently, algorithms based on deep convolutional neural nets have shown great promise for automated nodule detection. Most of the existing deep learning nodule detection systems are constructed in two steps: a) nodule candidates screening and b) false positive reduction, using two different models trained separately. Although it is commonly adopted, the two-step approach not only imposes significant resource overhead on training two independent deep learning models, but also is sub-optimal because it prevents cross-talk between the two. In this work, we present an end-to-end framework for nodule detection, integrating nodule candidate screening and false positive reduction into one model, trained jointly. We demonstrate that the end-to-end system improves the performance by 3.88\% over the two-step approach, while at the same time reducing model complexity by one third and cutting inference time by 3.6 fold. Code will be made publicly available.


Improving Dense Crowd Counting Convolutional Neural Networks using Inverse k-Nearest Neighbor Maps and Multiscale Upsampling

arXiv.org Machine Learning

Gatherings of thousands to millions of people occur frequently foran enormous variety of events, and automated counting of these high density crowds is used for safety, management, andmeasuring significance of these events. In this work, we show that the regularly accepted labeling scheme of crowd density maps for training deep neural networks is less effective than our alternative inverse k-nearest neighbor (ikNN) maps, even when used directly in existing state-ofthe-art networkstructures. We also provide a new network architecture MUD-ikNN, which uses multi-scale upsampling via transposed convolutions to take full advantage of the provided ikNN labeling. This upsampling combined with the ikNN maps further outperforms the existing state-of-the-art methods. The full label comparison emphasizes the importance ofthe labeling scheme, with the ikNN labeling being particularly effective. We demonstrate the accuracy of our MUD-ikNN network and the ikNN labeling scheme on a variety of datasets.


Generalizing semi-supervised generative adversarial networks to regression

arXiv.org Machine Learning

In this work, we generalize semi-supervised generative adversarial networks (GANs) from classification problems to regression problems. In the last few years, the importance of improving the training of neural networks using semi-supervised training has been demonstrated for classification problems. With probabilistic classification being a subset of regression problems, this generalization opens up many new possibilities for the use of semi-supervised GANs as well as presenting an avenue for a deeper understanding of how they function. We first demonstrate the capabilities of semi-supervised regression GANs on a toy dataset which allows for a detailed understanding of how they operate in various circumstances. This toy dataset is used to provide a theoretical basis of the semi-supervised regression GAN. We then apply the semi-supervised regression GANs to the real-world application of age estimation from single images. We perform extensive tests of what accuracies can be achieved with significantly reduced annotated data. Through the combination of the theoretical example and real-world scenario, we demonstrate how semi-supervised GANs can be generalized to regression problems.


End-to-End Training Approaches for Discriminative Segmental Models

arXiv.org Machine Learning

Recent work on discriminative segmental models has shown that they can achieve competitive speech recognition performance, using features based on deep neural frame classifiers. However, segmental models can be more challenging to train than standard frame-based approaches. While some segmental models have been successfully trained end to end, there is a lack of understanding of their training under different settings and with different losses. We investigate a model class based on recent successful approaches, consisting of a linear model that combines segmental features based on an LSTM frame classifier. Similarly to hybrid HMM-neural network models, segmental models of this class can be trained in two stages (frame classifier training followed by linear segmental model weight training), end to end (joint training of both frame classifier and linear weights), or with end-to-end fine-tuning after two-stage training. We study segmental models trained end to end with hinge loss, log loss, latent hinge loss, and marginal log loss. We consider several losses for the case where training alignments are available as well as where they are not. We find that in general, marginal log loss provides the most consistent strong performance without requiring ground-truth alignments. We also find that training with dropout is very important in obtaining good performance with end-to-end training. Finally, the best results are typically obtained by a combination of two-stage training and fine-tuning.