Although the potential for artificial intelligence to transform healthcare in lower income countries has been much hyped, the technology is proving genuinely useful in helping Africa overcome difficulties in tackling diseases. Such technology can automate medical tasks and help doctors to do more with limited resources. It can even accelerate advances if certain barriers are overcome. The work of minoHealth AI Labs, the Ghana-based data science start-up that I founded, offers one example. By collecting medical images, we are seeking to automate radiology through the use of deep learning.
This doctoral project will develop the latest computer vision deep learning techniques to better predict outcomes of complex respiratory disease. Through image analysis of chest X-rays and CT scans, we could provide clinicians with more reliable and more objective evidence of specific disease types, and provide greater justification for what are often challenging and high-risk treatments. We currently lack adequate tools for diagnosing complex respiratory infection. Using existing methods, it can be difficult to know whether newly identified bacteria have caused the severe lung disease, or whether they simply reflect the severe disease, acting as so-called 'bystanders' that have thrived in this environment. This is a problem, as (i) some infections such as those caused by Pseudomonas and Mycobacteria may require intravenous and potentially long and toxic therapy; and (ii) antibiotic usage when it is not required for treatment drives increasing drug resistance, and this remains a global emergency.
A decade of unprecedented progress in artificial intelligence (AI) has demonstrated the potential for many fields, including medicine, to benefit from the insights that AI techniques can extract from data. Here we survey recent progress in the development of modern computer vision techniques, powered by deep learning, for medical applications, focusing on medical imaging, medical video, and clinical deployment. We start by briefly summarizing a decade of progress in convolutional neural networks, including the vision tasks they enable, in the context of healthcare. Next, we discuss several example medical imaging applications that stand to benefit, including cardiology, pathology, dermatology, and ophthalmology, and propose new avenues for continued work. We then expand into general medical video, highlighting ways in which clinical workflows can integrate computer vision to enhance care. Finally, we discuss the challenges and hurdles that must be overcome for real-world clinical deployment of these technologies.
The aim was to automate skeletal muscle segmentation in a pediatric population using convolutional neural networks that identify and segment the L3 level at CT. In this retrospective study, two sets of U-Net–based models were developed to identify the L3 level in the sagittal plane and segment the skeletal muscle from the corresponding axial image. For model development, 370 patients (sampled uniformly across age groups from 0 to 18 years and including both sexes) were selected between January 2009 and January 2019, and ground truth L3 location and skeletal muscle segmentation were manually defined. Twenty percent (74 of 370) of the examinations were reserved for testing the L3 locator and muscle segmentation, while the remainder were used for training. For the L3 locator models, maximum intensity projections (MIPs) from a fixed number of central sections of sagittal reformats (either 12 or 18 sections) were used as input with or without transfer learning using an L3 localizer trained on an external dataset (four models total).
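The input preparation described above can be illustrated with a minimal sketch. It assumes the CT volume is already loaded as a 3-D NumPy array and that the last axis runs left to right (so each index along it is one sagittal section). The function names, the axis convention, and the random 80/20 split are illustrative assumptions, not details taken from the study.

```python
import numpy as np


def central_sagittal_mip(volume, n_sections=12, sagittal_axis=2):
    """Maximum intensity projection over the central sagittal sections.

    volume: 3-D array of CT intensities. `sagittal_axis` is assumed to
    index left-right position; the real axis order depends on the loader.
    """
    n_total = volume.shape[sagittal_axis]
    if n_sections > n_total:
        raise ValueError("n_sections exceeds the number of available sections")
    # Pick a window of n_sections centered on the middle of the volume.
    start = (n_total - n_sections) // 2
    window = np.arange(start, start + n_sections)
    central = np.take(volume, window, axis=sagittal_axis)
    # Collapse the window by keeping the brightest voxel along that axis.
    return central.max(axis=sagittal_axis)


def split_train_test(n_exams=370, test_fraction=0.2, seed=0):
    """Hypothetical random split; 20% of 370 examinations gives 74 test cases."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_exams)
    n_test = round(n_exams * test_fraction)
    return order[n_test:], order[:n_test]
```

Calling `central_sagittal_mip(volume, n_sections=18)` would give the 18-section variant; bright structures outside the central window do not contribute to the projection.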
The success of modern deep learning algorithms for image segmentation heavily depends on the availability of large datasets with clean pixel-level annotations (masks), where the objects of interest are accurately delineated. Lack of time and expertise during data annotation leads to incorrect boundaries and label noise. It is known that deep convolutional neural networks (DCNNs) can memorize even completely random labels, resulting in poor accuracy. We propose a framework to train binary segmentation DCNNs using sets of unreliable pixel-level annotations. Erroneously labeled pixels are identified based on the estimated aleatoric uncertainty of the segmentation and are relabeled to the true value.
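The relabeling step can be sketched roughly as follows. This is only an illustration under strong assumptions: the per-pixel uncertainty map is taken as given (how it is estimated is not shown), a fixed threshold decides which pixels count as erroneous, and a flagged pixel is corrected by flipping its label, which is the only possible correction in the binary case. The function name and the threshold value are hypothetical.

```python
import numpy as np


def relabel_uncertain_pixels(noisy_mask, uncertainty, threshold=0.5):
    """Flip binary labels at pixels whose estimated uncertainty is high.

    noisy_mask: 2-D array of 0/1 labels from an unreliable annotation.
    uncertainty: 2-D array, same shape, assumed to hold per-pixel
    aleatoric uncertainty estimates (larger means less trustworthy).
    threshold: hypothetical cut-off above which a label is treated as wrong.
    """
    mask = np.asarray(noisy_mask).astype(bool)
    flagged = np.asarray(uncertainty) > threshold
    if mask.shape != flagged.shape:
        raise ValueError("mask and uncertainty must have the same shape")
    # In binary segmentation, correcting a wrong label means taking its complement.
    cleaned = np.where(flagged, ~mask, mask)
    return cleaned.astype(np.uint8), flagged
```

The returned `flagged` array makes it possible to inspect how many pixels were changed before the cleaned masks are used for another round of training.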
Computer-aided detection (CAD) of benign and malignant breast lesions is becoming increasingly essential in breast ultrasound (US) imaging. CAD systems rely on imaging features identified by medical experts for their performance, whereas deep learning (DL) methods automatically extract features from the data. The main challenge for DL is the shortage of breast US images available to train the models. Here, we present an ensemble transfer learning model to classify benign and malignant breast tumors using B-mode breast US (B-US) and strain elastography breast US (SE-US) images. This model combines semantic features from AlexNet and ResNet models to distinguish benign from malignant tumors. We use both B-US and SE-US images to train the model and classify the tumors. We retrospectively gathered data from 85 patients, with 42 benign and 43 malignant cases confirmed by biopsy. Each patient had multiple B-US images and their corresponding SE-US images, and the total dataset contained 261 B-US images and 261 SE-US images. Experimental results show that our ensemble model achieves a sensitivity of 88.89% and a specificity of 91.10%. This diagnostic performance is equivalent to or better than manual identification. Thus, our proposed ensemble learning method would facilitate early detection of breast cancer, reliably improving patient care.
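For reference, sensitivity and specificity are computed from the confusion matrix as TP / (TP + FN) and TN / (TN + FP). The sketch below assumes malignant cases are coded as 1 and benign cases as 0; that coding and the function name are illustrative choices, and the toy labels in the usage note are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be 1 for malignant (positive) and 0 for benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of malignant cases that were caught
    specificity = tn / (tn + fp)  # share of benign cases correctly cleared
    return sensitivity, specificity
```

With toy labels `y_true = [1, 1, 1, 0, 0, 0, 0, 1]` and `y_pred = [1, 1, 0, 0, 0, 1, 0, 1]`, three of four malignant and three of four benign cases are classified correctly, so both values are 0.75.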
COVID-19 classification using chest Computed Tomography (CT) has been found useful in practice by several studies. Due to the lack of annotated samples, these studies recommend transfer learning and explore the choices of pre-trained models and data augmentation. However, it is still unknown whether there are better strategies than vanilla transfer learning for more accurate COVID-19 classification with limited CT data. This paper provides an affirmative answer, devising a novel 'model' augmentation technique that allows a considerable performance boost to transfer learning for the task. Our method systematically reduces the distributional shift between the source and target domains and considers augmenting deep learning with complementary representation learning techniques. We establish the efficacy of our method with publicly available datasets and models, and identify contrasting observations in the previous studies.
In recent years, physiological signal-based authentication has shown great promise because of its inherent robustness against forgery. The electrocardiogram (ECG) signal, being the most widely studied biosignal, has also received the highest level of attention in this regard. Numerous studies have shown that it is possible to identify different persons from their ECG signals with acceptable accuracy. In this work, we present EDITH, a deep learning-based framework for ECG biometric authentication. Moreover, we hypothesize and demonstrate that Siamese architectures can be used in place of typical distance metrics for improved performance. We have evaluated EDITH on 4 commonly used datasets and outperformed prior works while using fewer beats. EDITH performs competitively using just a single heartbeat (96-99.75% accuracy) and can be further enhanced by fusing multiple beats (100% accuracy from 3 to 6 beats). Furthermore, the proposed Siamese architecture manages to reduce the identity verification Equal Error Rate (EER) to 1.29%. A limited case study of EDITH with real-world experimental data also suggests its potential as a practical authentication system.
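The Equal Error Rate is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The sketch below is one simple way to estimate it from two lists of similarity scores, assuming higher scores mean "more likely the same person". The function name and the threshold sweep over observed scores are illustrative choices, not the evaluation code used for EDITH.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER and the threshold at which it occurs.

    genuine_scores: similarity scores for same-person pairs.
    impostor_scores: similarity scores for different-person pairs.
    A pair is accepted when its score is >= the threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    best = None
    # Sweep every observed score as a candidate decision threshold.
    for thr in sorted(set(genuine_scores) | set(impostor_scores)):
        far = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < thr for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finite data FAR and FRR rarely match exactly,
            # so report their average at the closest crossing.
            best = (gap, (far + frr) / 2, thr)
    return best[1], best[2]
```

For toy scores `genuine = [0.9, 0.8, 0.7, 0.4]` and `impostor = [0.1, 0.2, 0.3, 0.6]`, one pair of each kind falls on the wrong side of the threshold 0.6, so the estimated EER is 0.25.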
Joint damage in Rheumatoid Arthritis (RA) is assessed by manually inspecting and grading radiographs of hands and feet. This is a tedious task which requires trained experts whose subjective assessment leads to low inter-rater agreement. An algorithm which can automatically predict the joint level damage in hands and feet can help optimize this process, which will eventually aid doctors in better patient care and research. In this paper, we propose a two-staged approach which combines object detection and convolutional neural networks with attention, and which can efficiently and accurately predict the overall and joint level narrowing and erosion from patients' radiographs. This approach has been evaluated on hands and feet radiographs of patients suffering from RA and has achieved a weighted root mean squared error (RMSE) of 1.358 and 1.404 in predicting joint level narrowing and erosion Sharp van der Heijde (SvH) scores, which is a 31% and 19% improvement over the baseline SvH scores, respectively. The proposed approach achieved a weighted absolute error of 1.456 in predicting the overall damage in hands and feet radiographs for the patients, which is a 79% improvement over the baseline. Our method also provides an inherent capability to explain model predictions using attention weights, which is essential given the black box nature of deep learning models. The proposed approach was developed during the RA2 Dream Challenge hosted by Dream Challenges and secured 4th and 8th position in predicting overall and joint level narrowing and erosion SvH scores from radiographs, respectively.
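As a rough illustration of the metrics quoted above, a weighted RMSE and a relative improvement over a baseline can be computed as in the sketch below. The weighting scheme actually used in the challenge is not stated here, so `weights` is simply an assumed per-sample input, and both function names are hypothetical.

```python
import math


def weighted_rmse(y_true, y_pred, weights):
    """Square root of the weighted mean of squared errors.

    weights: assumed per-sample weights; how they are chosen is not
    specified in the text and is left to the caller.
    """
    if not y_true or not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    squared = sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return math.sqrt(squared / total_weight)


def relative_improvement(baseline_error, model_error):
    """Fraction by which the model error is lower than the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - model_error) / baseline_error
```

With toy values, `weighted_rmse([0, 2, 4], [0, 0, 4], [1, 1, 2])` evaluates to 1.0, and `relative_improvement(2.0, 1.5)` evaluates to 0.25, that is, a 25% improvement.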
Bone age assessment is challenging in clinical practice because the assessment process is complicated. Current automatic bone age assessment methods were designed with little consideration of the diagnostic logistics and thus may yield uninterpretable hidden states and outputs. Consequently, doctors can find it hard to work with such models because it is difficult to check the correctness of the model predictions. In this work, we propose a new graph-based deep learning framework for bone age assessment with hand radiographs, called Doctor Imitator (DI). The architecture of DI is designed to learn the diagnostic logistics of doctors using the scoring methods (e.g., the Tanner-Whitehouse method) for bone age assessment. Specifically, the convolutions of DI capture the local features of the anatomical regions of interest (ROIs) on hand radiographs and predict the ROI scores with our proposed Anatomy-based Group Convolution; the ROI scores are then summed to give the bone age prediction. In addition, we develop a novel Dual Graph-based Attention module to compute patient-specific attention for ROI features and context attention for ROI scores. As far as we know, DI is the first automatic bone age assessment framework that follows the scoring methods without fully supervised hand radiographs. Experiments on hand radiographs with only bone age supervision verify that DI can achieve excellent performance with sparse parameters and provide more interpretability.