Not enough data to create a plot.
Try a different view from the menu above.
Peng, Yifan
Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment
Ji, Zhanghexuan, Shaikh, Mohammad Abuzar, Moukheiber, Dana, Srihari, Sargur, Peng, Yifan, Gao, Mingchen
Self-supervised learning provides an opportunity to explore unlabeled chest X-rays and their associated free-text reports accumulated in clinical routine without manual supervision. This paper proposes a Joint Image Text Representation Learning Network (JoImTeRNet) for pre-training on chest X-ray images and their radiology reports. The model was pre-trained on both the global image-sentence level and the local image region-word level for visual-textual matching. Both are bidirectionally constrained on Cross-Entropy based and ranking-based Triplet Matching Losses. The region-word matching is calculated using the attention mechanism without direct supervision about their mapping. The pre-trained multi-modal representation learning paves the way for downstream tasks concerning image and/or text encoding. We demonstrate the representation learning quality by cross-modality retrievals and multilabel classifications on two datasets: OpenI-IU and MIMIC-CXR.
Multi-modal, multi-task, multi-attention (M3) deep learning detection of reticular pseudodrusen: towards automated and accessible classification of age-related macular degeneration
Chen, Qingyu, Keenan, Tiarnan D. L., Allot, Alexis, Peng, Yifan, Agrón, Elvira, Domalpally, Amitha, Klaver, Caroline C. W., Luttikhuizen, Daniel T., Colyer, Marcus H., Cukras, Catherine A., Wiley, Henry E., Magone, M. Teresa, Cousineau-Krieger, Chantal, Wong, Wai T., Zhu, Yingying, Chew, Emily Y., Lu, Zhiyong
Objective Reticular pseudodrusen (RPD), a key feature of age-related macular degeneration (AMD), are poorly detected by human experts on standard color fundus photography (CFP) and typically require advanced imaging modalities such as fundus autofluorescence (FAF). The objective was to develop and evaluate the performance of a novel'M3' deep learning framework on RPD detection. Materials and Methods A deep learning framework M3 was developed to detect RPD presence accurately using CFP alone, FAF alone, or both, employing 8000 CFP-FAF image pairs obtained prospectively (Age-Related Eye Disease Study 2). The M3 framework includes multi-modal (detection from single or multiple image modalities), multi-task (training different tasks simultaneously to improve generalizability), and multi-attention (improving ensembled feature representation) operation. Performance on RPD detection was compared with state-of-the-art deep learning models and 13 ophthalmologists; performance on detection of two other AMD features (geographic atrophy and pigmentary abnormalities) was also evaluated. Results For RPD detection, M3 achieved area under receiver operating characteristic (AUROC) 0.832, 0.931, and 0.933 for CFP alone, FAF alone, and both, respectively. M3 performance on CFP was very substantially superior to human retinal specialists (median F1-score 0.644 versus 0.350). External validation (on Rotterdam Study, Netherlands) demonstrated high accuracy on CFP alone (AUROC 0.965). The M3 framework also accurately detected geographic atrophy and pigmentary abnormalities (AUROC 0.909 and 0.912, respectively), demonstrating its generalizability. Conclusion This study demonstrates the successful development, robust evaluation, and external validation of a novel deep learning framework that enables accessible, accurate, and automated AMD diagnosis and prognosis. INTRODUCTION Age-related macular degeneration (AMD) is the leading cause of legal blindness in developed countries [1 2]. Late AMD is the stage with the potential for severe visual loss; it takes two forms, geographic atrophy and neovascular AMD. AMD is traditionally diagnosed and classified using color fundus photography (CFP) [3], the most widely used and accessible imaging modality in ophthalmology. In the absence of late disease, two main features (macular drusen and pigmentary abnormalities) are used to classify disease and stratify risk of progression to late AMD [3]. More recently, additional imaging modalities have become available in specialist centers, particularly fundus autofluorescence (FAF) imaging [4 5]. Following these developments in retinal imaging, a third macular feature (reticular pseudodrusen, RPD) is now recognized as a key AMD lesion [6 7].
Efficient Long-Range Convolutions for Point Clouds
Peng, Yifan, Lin, Lin, Ying, Lexing, Zepeda-Núñez, Leonardo
The efficient treatment of long-range interactions for point clouds is a challenging problem in many scientific machine learning applications. To extract global information, one usually needs a large window size, a large number of layers, and/or a large number of channels. This can often significantly increase the computational cost. In this work, we present a novel neural network layer that directly incorporates long-range information for a point cloud. This layer, dubbed the long-range convolutional (LRC)-layer, leverages the convolutional theorem coupled with the non-uniform Fourier transform. In a nutshell, the LRC-layer mollifies the point cloud to an adequately sized regular grid, computes its Fourier transform, multiplies the result by a set of trainable Fourier multipliers, computes the inverse Fourier transform, and finally interpolates the result back to the point cloud. The resulting global all-to-all convolution operation can be performed in nearly-linear time asymptotically with respect to the number of input points. The LRC-layer is a particularly powerful tool when combined with local convolution as together they offer efficient and seamless treatment of both short and long range interactions. We showcase this framework by introducing a neural network architecture that combines LRC-layers with short-range convolutional layers to accurately learn the energy and force associated with a $N$-body potential. We also exploit the induced two-level decomposition and propose an efficient strategy to train the combined architecture with a reduced number of samples.
ML-Net: multi-label classification of biomedical texts with deep neural networks
Du, Jingcheng, Chen, Qingyu, Peng, Yifan, Xiang, Yang, Tao, Cui, Lu, Zhiyong
In multi-label text classification, each textual document can be assigned with one or more labels. Due to this nature, the multi-label text classification task is often considered to be more challenging compared to the binary or multi-class text classification problems. As an important task with broad applications in biomedicine such as assigning diagnosis codes, a number of different computational methods (e.g. training and combining binary classifiers for each label) have been proposed in recent years. However, many suffered from modest accuracy and efficiency, with only limited success in practical use. We propose ML-Net, a novel deep learning framework, for multi-label classification of biomedical texts. As an end-to-end system, ML-Net combines a label prediction network with an automated label count prediction mechanism to output an optimal set of labels by leveraging both predicted confidence score of each label and the contextual information in the target document. We evaluate ML-Net on three independent, publicly-available corpora in two kinds of text genres: biomedical literature and clinical notes. For evaluation, example-based measures such as precision, recall and f-measure are used. ML-Net is compared with several competitive machine learning baseline models. Our benchmarking results show that ML-Net compares favorably to the state-of-the-art methods in multi-label classification of biomedical texts. ML-NET is also shown to be robust when evaluated on different text genres in biomedicine. Unlike traditional machine learning methods, ML-Net does not require human efforts in feature engineering and is highly efficient and scalable approach to tasks with a large set of labels (no need to build individual classifiers for each separate label). Finally, ML-NET is able to dynamically estimate the label count based on the document context in a more systematic and accurate manner.
BioSentVec: creating sentence embeddings for biomedical texts
Chen, Qingyu, Peng, Yifan, Lu, Zhiyong
Sentence embeddings have become an essential part of today's natural language processing (NLP) systems, especially together advanced deep learning methods. Although pre-trained sentence encoders are available in the general domain, none exists for biomedical texts to date. In this work, we introduce BioSentVec: the first open set of sentence embeddings trained with over 30 million documents from both scholarly articles in PubMed and clinical notes in the MIMIC-III Clinical Database. We evaluate BioSentVec embeddings in two sentence pair similarity tasks in different text genres. Our benchmarking results demonstrate that the BioSentVec embeddings can better capture sentence semantics compared to the other competitive alternatives and achieve state-of-the-art performance in both tasks. We expect BioSentVec to facilitate the research and development in biomedical text mining and to complement the existing resources in biomedical word embeddings.