Frangi, Alejandro F
Integrating Deep Learning with Fundus and Optical Coherence Tomography for Cardiovascular Disease Prediction
Maldonado-Garcia, Cynthia, Zakeri, Arezoo, Frangi, Alejandro F, Ravikumar, Nishant
Early identification of patients at risk of cardiovascular diseases (CVD) is crucial for effective preventive care, reducing healthcare burden, and improving patients' quality of life. This study demonstrates the potential of retinal optical coherence tomography (OCT) imaging combined with fundus photographs for identifying future adverse cardiac events. We used data from 977 patients who experienced CVD within a 5-year interval post-image acquisition, alongside 1,877 control participants without CVD, totaling 2,854 subjects. We propose a novel binary classification network based on a Multi-channel Variational Autoencoder (MCVAE), which learns a latent embedding of patients' fundus and OCT images to classify individuals into two groups: those likely to develop CVD in the future and those who are not. Our model, trained on both imaging modalities, achieved promising results (AUROC 0.78 +/- 0.02, accuracy 0.68 +/- 0.002, precision 0.74 +/- 0.02, sensitivity 0.73 +/- 0.02, and specificity 0.68 +/- 0.01), demonstrating its efficacy in identifying patients at risk of future CVD events based on their retinal images. This study highlights the potential of retinal OCT imaging and fundus photographs as cost-effective, non-invasive alternatives for predicting cardiovascular disease risk. The widespread availability of these imaging techniques in optometry practices and hospitals further enhances their potential for large-scale CVD risk screening. Our findings contribute to the development of standardized, accessible methods for early CVD risk identification, potentially improving preventive care strategies and patient outcomes.
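To make the architecture described above concrete, here is a minimal sketch (not the authors' code) of a two-channel VAE whose shared latent feeds a binary CVD-risk classifier. The module layout, feature dimensions, and the averaging of the channel posteriors are illustrative assumptions; inputs are assumed to be pre-extracted, flattened fundus and OCT feature vectors.

```python
# Hedged sketch of a multi-channel VAE with a classifier head (PyTorch).
import torch
import torch.nn as nn

class ChannelEncoder(nn.Module):
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)

class MCVAEClassifier(nn.Module):
    # fundus_dim / oct_dim / latent_dim are placeholder sizes.
    def __init__(self, fundus_dim=2048, oct_dim=2048, latent_dim=64):
        super().__init__()
        self.enc_fundus = ChannelEncoder(fundus_dim, latent_dim)
        self.enc_oct = ChannelEncoder(oct_dim, latent_dim)
        self.dec_fundus = nn.Linear(latent_dim, fundus_dim)
        self.dec_oct = nn.Linear(latent_dim, oct_dim)
        self.classifier = nn.Linear(latent_dim, 1)  # future CVD vs. no CVD

    def reparameterise(self, mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def forward(self, fundus, oct_feat):
        # fundus: (B, fundus_dim), oct_feat: (B, oct_dim) flattened features.
        mu_f, lv_f = self.enc_fundus(fundus)
        mu_o, lv_o = self.enc_oct(oct_feat)
        # Average the per-channel posteriors into one joint latent (one simple choice).
        mu, logvar = (mu_f + mu_o) / 2, (lv_f + lv_o) / 2
        z = self.reparameterise(mu, logvar)
        return self.dec_fundus(z), self.dec_oct(z), self.classifier(z), mu, logvar
```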
Predicting risk of cardiovascular disease using retinal OCT imaging
Maldonado-Garcia, Cynthia, Bonazzola, Rodrigo, Ferrante, Enzo, Julian, Thomas H, Sergouniotis, Panagiotis I, Ravikumar, Nishant, Frangi, Alejandro F
Purpose: We investigated the potential of optical coherence tomography (OCT) as an additional imaging technique to predict future cardiovascular disease (CVD). Design: Retrospective cohort study. Participants: We employed retinal OCT imaging data obtained from the UK Biobank. Data for 630 patients who suffered acute myocardial infarction (MI) or stroke within a 5-year interval after image acquisition, together with an equal number of participants without CVD (control group), were used to train our model (1,260 subjects in total). Methods: We utilised a self-supervised deep learning approach based on Variational Autoencoders (VAE) to learn low-dimensional (latent) representations of high-dimensional 3D OCT images and to capture distinct characteristics of different retinal layers within the OCT image. A Random Forest (RF) classifier was subsequently trained using the learned latent features and participant demographic and clinical data to differentiate between patients at risk of CVD events (MI or stroke) and non-CVD cases. Main Outcome Measures: Our predictive model, trained on multimodal data, was assessed based on its ability to correctly identify individuals likely to suffer a CVD event (MI or stroke) within a 5-year interval after image acquisition. Results: Our self-supervised VAE feature extraction and multimodal Random Forest classifier differentiate between patients at risk of future CVD events and the control group with an AUC of 0.75, outperforming the clinically established QRISK3 score (AUC = 0.597). The choroidal layer visible in OCT images was identified as an important predictor of future CVD events using a novel approach to model explainability. Conclusions: Retinal OCT imaging provides a cost-effective and non-invasive alternative to predict the risk of cardiovascular disease and is readily accessible in optometry practices and hospitals.
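The second stage of the pipeline described above, a Random Forest trained on learned latent features plus demographic and clinical variables, can be illustrated with the synthetic sketch below. All arrays are random placeholders standing in for the real UK Biobank features; shapes and hyperparameters are arbitrary.

```python
# Illustrative only: combine (placeholder) OCT latent features with tabular
# demographic/clinical variables and train a Random Forest classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1260
z_oct = rng.normal(size=(n, 64))    # latent features from a trained VAE (placeholder)
demo = rng.normal(size=(n, 10))     # age, sex, blood pressure, ... (placeholder)
y = rng.integers(0, 2, size=n)      # 1 = MI/stroke within 5 years (synthetic)

X = np.hstack([z_oct, demo])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))
```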
Radiology Report Generation Using Transformers Conditioned with Non-imaging Data
Aksoy, Nurbanu, Ravikumar, Nishant, Frangi, Alejandro F
Medical image interpretation is central to most clinical applications such as disease diagnosis, treatment planning, and prognostication. In clinical practice, radiologists examine medical images and manually compile their findings into reports, which can be a time-consuming process. Automated approaches to radiology report generation, therefore, can reduce radiologist workload and improve efficiency in the clinical pathway. While recent deep-learning approaches for automated report generation from medical images have seen some success, most studies have relied on image-derived features alone, ignoring non-imaging patient data. Although a few studies have included word-level context along with the image, the use of patient demographics remains unexplored. This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information to synthesise patient-specific radiology reports. The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information to synthesise full-text radiology reports. Data from two public databases were used to train and evaluate the proposed approach. CXRs and reports were extracted from the MIMIC-CXR database and combined with the corresponding patients' data from MIMIC-IV. Based on the evaluation metrics used, including patient demographic information was found to improve the quality of reports generated using the proposed approach, relative to a baseline network trained using CXRs alone. The proposed approach shows potential for enhancing radiology report generation by leveraging rich patient metadata and combining semantic text embeddings derived therefrom with medical image-derived visual features.
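A schematic sketch of the kind of architecture the abstract describes is shown below: a CNN backbone extracts spatial CXR features, and a transformer decoder generates report tokens conditioned on those features concatenated with embedded demographic tokens. The backbone choice (ResNet-18), token sizes, and layer counts are assumptions, not the published configuration.

```python
# Hedged sketch of CXR report generation conditioned on demographic tokens (PyTorch).
import torch
import torch.nn as nn
import torchvision.models as models

class ReportGenerator(nn.Module):
    def __init__(self, vocab_size=5000, d_model=512, demo_vocab=100):
        super().__init__()
        cnn = models.resnet18(weights=None)
        self.backbone = nn.Sequential(*list(cnn.children())[:-2])  # keep spatial features
        self.img_proj = nn.Linear(512, d_model)
        self.demo_embed = nn.Embedding(demo_vocab, d_model)  # e.g. age-bin / sex tokens
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, cxr, demo_tokens, report_tokens):
        # cxr: (B, 3, H, W); demo_tokens: (B, N_demo) ints; report_tokens: (B, T) ints.
        feats = self.backbone(cxr)                   # (B, 512, H', W')
        feats = feats.flatten(2).transpose(1, 2)     # (B, H'*W', 512)
        memory = torch.cat([self.img_proj(feats),
                            self.demo_embed(demo_tokens)], dim=1)
        tgt = self.tok_embed(report_tokens)
        # A causal tgt_mask would be added here for teacher-forced training.
        return self.out(self.decoder(tgt, memory))
```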
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation
Aksoy, Nurbanu, Sharoff, Serge, Baser, Selcuk, Ravikumar, Nishant, Frangi, Alejandro F
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images. Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists. In this paper, we present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes. We introduce a conditioned cross-multi-head attention module to fuse these heterogeneous data modalities, bridging the semantic gap between visual and textual data. Experiments demonstrate substantial improvements from using additional modalities compared to relying on images alone. Notably, our model achieves the highest reported performance on the ROUGE-L metric compared to relevant state-of-the-art models in the literature. Furthermore, we employed both human evaluation and clinical semantic similarity measurement alongside word-overlap metrics to improve the depth of the quantitative analysis. A human evaluation, conducted by a board-certified radiologist, confirms the model's accuracy in identifying high-level findings; however, it also highlights that further improvement is needed to capture nuanced details and clinical context.
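One plausible reading of the cross-modal fusion described above is sketched below: image tokens query embeddings of the structured data and clinical notes through multi-head cross-attention with a residual connection. This is illustrative only; dimensions, normalisation, and layout are assumptions rather than the paper's conditioned module.

```python
# Hedged sketch of cross-modal fusion via multi-head cross-attention (PyTorch).
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, img_tokens, context_tokens):
        # img_tokens: (B, N_img, d); context_tokens: (B, N_ctx, d) built from
        # vital signs / symptoms / clinical-note embeddings.
        fused, _ = self.attn(query=img_tokens, key=context_tokens,
                             value=context_tokens)
        return self.norm(img_tokens + fused)  # residual connection

fusion = CrossModalFusion()
out = fusion(torch.randn(2, 49, 512), torch.randn(2, 16, 512))
print(out.shape)  # torch.Size([2, 49, 512])
```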
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
Lekadir, Karim, Feragen, Aasa, Fofanah, Abdul Joseph, Frangi, Alejandro F, Buyx, Alena, Emelie, Anais, Lara, Andrea, Porras, Antonio R, Chan, An-Wen, Navarro, Arcadi, Glocker, Ben, Botwe, Benard O, Khanal, Bishesh, Beger, Brigit, Wu, Carol C, Cintas, Celia, Langlotz, Curtis P, Rueckert, Daniel, Mzurikwao, Deogratias, Fotiadis, Dimitrios I, Zhussupov, Doszhan, Ferrante, Enzo, Meijering, Erik, Weicken, Eva, González, Fabio A, Asselbergs, Folkert W, Prior, Fred, Krestin, Gabriel P, Collins, Gary, Tegenaw, Geletaw S, Kaissis, Georgios, Misuraca, Gianluca, Tsakou, Gianna, Dwivedi, Girish, Kondylakis, Haridimos, Jayakody, Harsha, Woodruff, Henry C, Aerts, Hugo JWL, Walsh, Ian, Chouvarda, Ioanna, Buvat, Irène, Rekik, Islem, Duncan, James, Kalpathy-Cramer, Jayashree, Zahir, Jihad, Park, Jinah, Mongan, John, Gichoya, Judy W, Schnabel, Julia A, Kushibar, Kaisar, Riklund, Katrine, Mori, Kensaku, Marias, Kostas, Amugongo, Lameck M, Fromont, Lauren A, Maier-Hein, Lena, Alberich, Leonor Cerdá, Rittner, Leticia, Phiri, Lighton, Marrakchi-Kacem, Linda, Donoso-Bach, Lluís, Martí-Bonmatí, Luis, Cardoso, M Jorge, Bobowicz, Maciej, Shabani, Mahsa, Tsiknakis, Manolis, Zuluaga, Maria A, Bielikova, Maria, Fritzsche, Marie-Christine, Linguraru, Marius George, Wenzel, Markus, De Bruijne, Marleen, Tolsgaard, Martin G, Ghassemi, Marzyeh, Ashrafuzzaman, Md, Goisauf, Melanie, Yaqub, Mohammad, Ammar, Mohammed, Abadía, Mónica Cano, Mahmoud, Mukhtar M E, Elattar, Mustafa, Rieke, Nicola, Papanikolaou, Nikolaos, Lazrak, Noussair, Díaz, Oliver, Salvado, Olivier, Pujol, Oriol, Sall, Ousmane, Guevara, Pamela, Gordebeke, Peter, Lambin, Philippe, Brown, Pieta, Abolmaesumi, Purang, Dou, Qi, Lu, Qinghua, Osuala, Richard, Nakasi, Rose, Zhou, S Kevin, Napel, Sandy, Colantonio, Sara, Albarqouni, Shadi, Joshi, Smriti, Carter, Stacy, Klein, Stefan, Petersen, Steffen E, Aussó, Susanna, Awate, Suyash, Raviv, Tammy Riklin, Cook, Tessa, Mutsvangwa, Tinashe E M, Rogers, Wendy A, Niessen, Wiro J, Puig-Bosch, Xènia, Zeng, Yi, Mohammed, Yunusa G, Aquino, Yves Saint James, Salahuddin, Zohaib, Starmans, Martijn P A
Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real-world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 interdisciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices was defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline that provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account at the proof-of-concept stage to facilitate the future translation of medical AI towards clinical practice.
Adaptive Semi-Supervised Segmentation of Brain Vessels with Ambiguous Labels
Lin, Fengming, Xia, Yan, Ravikumar, Nishant, Liu, Qiongyao, MacRaild, Michael, Frangi, Alejandro F
Accurate segmentation of brain vessels is crucial for cerebrovascular disease diagnosis and treatment. However, existing methods face challenges in capturing small vessels and handling datasets that are partially or ambiguously annotated. In this paper, we propose an adaptive semi-supervised approach to address these challenges. Our approach incorporates innovative techniques including progressive semi-supervised learning, an adaptive training strategy, and boundary enhancement. Experimental results on 3DRA datasets demonstrate the superiority of our method in terms of mesh-based segmentation metrics. By leveraging the partially and ambiguously labeled data, in which only the main vessels are annotated, our method achieves impressive segmentation performance on mislabeled fine vessels, showcasing its potential for clinical applications.
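The abstract does not spell out the progressive semi-supervised scheme, so the sketch below shows only a generic ingredient that such methods commonly build on: a confidence-thresholded pseudo-labelling term alongside a supervised loss on annotated voxels. It should not be read as the authors' training strategy; masks, weights, and the threshold are assumptions.

```python
# Generic, illustrative semi-supervised segmentation loss (PyTorch).
import torch
import torch.nn.functional as F

def semi_supervised_loss(logits, labels, annotated_mask, threshold=0.9):
    # logits, labels, annotated_mask: tensors of identical shape, e.g. (B, 1, D, H, W);
    # annotated_mask is boolean and marks voxels with trusted annotations.
    probs = torch.sigmoid(logits)
    sup = F.binary_cross_entropy_with_logits(
        logits[annotated_mask], labels[annotated_mask])
    # Pseudo-labels for unannotated voxels, kept only where the model is confident.
    pseudo = (probs > 0.5).float()
    confident = ((probs > threshold) | (probs < 1 - threshold)) & ~annotated_mask
    unsup = (F.binary_cross_entropy_with_logits(logits[confident], pseudo[confident])
             if confident.any() else logits.sum() * 0)
    return sup + 0.5 * unsup  # 0.5 is an arbitrary weighting for the unsupervised term
```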
Learning disentangled representations for explainable chest X-ray classification using Dirichlet VAEs
Harkness, Rachael, Frangi, Alejandro F, Zucker, Kieran, Ravikumar, Nishant
This study explores the use of the Dirichlet Variational Autoencoder (DirVAE) for learning disentangled latent representations of chest X-ray (CXR) images. Our working hypothesis is that distributional sparsity, as facilitated by the Dirichlet prior, will encourage disentangled feature learning for the complex task of multi-label classification of CXR images. The DirVAE is trained using CXR images from the CheXpert database, and the predictive capacity of multi-modal latent representations learned by DirVAE models is investigated through implementation of an auxiliary multi-label classification task, with a view to enforcing separation of latent factors according to class-specific features. The predictive performance and explainability of the latent space learned using the DirVAE were quantitatively and qualitatively assessed, respectively, and compared with a standard Gaussian-prior VAE (GVAE). We introduce a new approach for explainable multi-label classification in which we conduct gradient-guided latent traversals for each class of interest. Study findings indicate that the DirVAE is able to disentangle latent factors into class-specific visual features, a property not afforded by the GVAE, and achieves a marginal increase in predictive performance relative to the GVAE. We generate visual examples to show that our explainability method, when applied to the trained DirVAE, is able to highlight regions in CXR images that are clinically relevant to the class(es) of interest and, additionally, can identify cases where classification relies on spurious feature correlations.
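The gradient-guided latent traversal idea can be sketched as follows, assuming hypothetical encoder, decoder, and classifier modules from a trained (Dir)VAE; the step size, number of steps, and the assumption that the encoder returns a single latent vector (e.g. the posterior mean) are illustrative choices, not the paper's implementation.

```python
# Hedged sketch of a gradient-guided latent traversal for one class of interest.
import torch

def gradient_guided_traversal(encoder, decoder, classifier, x, class_idx,
                              n_steps=5, step_size=0.5):
    # encoder(x) -> (B, latent_dim); classifier(z) -> (B, n_classes) logits.
    z = encoder(x).detach().requires_grad_(True)
    logit = classifier(z)[:, class_idx].sum()
    grad, = torch.autograd.grad(logit, z)
    direction = grad / (grad.norm(dim=1, keepdim=True) + 1e-8)
    # Decode images along the direction that most increases the class logit.
    return [decoder(z.detach() + k * step_size * direction) for k in range(n_steps + 1)]
```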
Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology
Bonazzola, Rodrigo, Ferrante, Enzo, Ravikumar, Nishant, Xia, Yan, Keavney, Bernard, Plein, Sven, Syeda-Mahmood, Tanveer, Frangi, Alejandro F
Recent genome-wide association studies (GWAS) have been successful in identifying associations between genetic variants and simple cardiac parameters derived from cardiac magnetic resonance (CMR) images. However, the emergence of big databases that include genetic data linked to CMR facilitates the investigation of more nuanced patterns of shape variability. Here, we propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles (UPE). UPE builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner, using deep learning models trained with different hyperparameters. These phenotypes are then analyzed via GWAS, retaining only highly confident and stable associations across the ensemble. We apply our approach to the UK Biobank database to extract left-ventricular (LV) geometric features from image-derived three-dimensional meshes. We demonstrate that our approach greatly improves the discoverability of genes influencing LV shape, identifying 11 loci with study-wide significance and 8 with suggestive significance. We argue that our approach would enable more extensive discovery of gene associations with image-derived phenotypes for other organs or image modalities.
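The ensemble filtering idea can be illustrated with the toy example below: latent phenotypes from several independently trained models are tested against each SNP, and only loci that stay significant in every ensemble member are retained. Genotypes and phenotypes here are synthetic, and a simple per-SNP linear regression stands in for a full GWAS pipeline; shapes and the significance threshold are placeholders.

```python
# Toy illustration of ensemble-stable SNP associations (synthetic data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects, n_snps, n_models, latent_dim = 500, 100, 3, 8
genotypes = rng.integers(0, 3, size=(n_subjects, n_snps)).astype(float)
# One latent phenotype matrix per ensemble member (would come from trained models).
phenotypes = [rng.normal(size=(n_subjects, latent_dim)) for _ in range(n_models)]

def min_pvalue_per_snp(geno, pheno):
    # Smallest association p-value over this member's latent dimensions.
    return np.array([min(stats.linregress(geno[:, j], pheno[:, k]).pvalue
                         for k in range(pheno.shape[1]))
                     for j in range(geno.shape[1])])

pvals = np.stack([min_pvalue_per_snp(genotypes, p) for p in phenotypes])
stable_hits = np.where((pvals < 5e-8).all(axis=0))[0]  # significant in every member
print("SNPs stable across the ensemble:", stable_hits)
```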
Flip Learning: Erase to Segment
Huang, Yuhao, Yang, Xin, Zou, Yuxin, Chen, Chaoyu, Wang, Jian, Dou, Haoran, Ravikumar, Nishant, Frangi, Alejandro F, Zhou, Jianqiao, Ni, Dong
Nodule segmentation from breast ultrasound images is challenging yet essential for diagnosis. Weakly-supervised segmentation (WSS) can help reduce time-consuming and cumbersome manual annotation. Unlike existing weakly-supervised approaches, in this study, we propose a novel and general WSS framework called Flip Learning, which needs only a box annotation. Specifically, the target within the label box is gradually erased to flip the classification tag, and the erased region is finally taken as the segmentation result. Our contribution is three-fold. First, our proposed approach erases at the superpixel level using a multi-agent reinforcement learning framework to exploit prior boundary knowledge and accelerate the learning process. Second, we design two rewards, a classification score reward and an intensity distribution reward, to avoid under- and over-segmentation, respectively. Third, we adopt a coarse-to-fine learning strategy to reduce residual errors and improve segmentation performance. Extensively validated on a large dataset, our proposed approach achieves competitive performance and shows great potential to narrow the gap between fully-supervised and weakly-supervised learning.
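For intuition only, below is a greedy, non-RL simplification of the erase-to-flip idea: superpixels inside the annotation box are erased one at a time, always removing the one that lowers a hypothetical nodule classifier's score the most, and the erased region is returned as the segmentation once the tag flips. This is not the paper's multi-agent reinforcement learning formulation, and `nodule_prob`, the fill value, and all thresholds are assumptions.

```python
# Greedy sketch of superpixel erasure from a box annotation (illustrative only).
import numpy as np
from skimage.segmentation import slic

def flip_learning_greedy(image, box_mask, nodule_prob, n_segments=50, flip_thr=0.5):
    # image: 2D float array; box_mask: boolean array marking the annotation box;
    # nodule_prob: hypothetical callable returning the classifier's nodule probability.
    sp = slic(image, n_segments=n_segments, start_label=1, channel_axis=None)
    sp = np.where(box_mask, sp, 0)             # keep superpixels inside the box only
    erased = np.zeros(image.shape, dtype=bool)
    current = image.copy()
    bg = float(image.min())                    # simple fill value for erased pixels
    while nodule_prob(current) > flip_thr:     # stop once the classification tag flips
        candidates = [l for l in np.unique(sp)
                      if l != 0 and not erased[sp == l].any()]
        if not candidates:
            break
        # Erase the superpixel whose removal lowers the nodule score the most.
        scores = []
        for l in candidates:
            trial = current.copy()
            trial[sp == l] = bg
            scores.append(nodule_prob(trial))
        best = candidates[int(np.argmin(scores))]
        current[sp == best] = bg
        erased |= (sp == best)
    return erased                              # binary segmentation mask
```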