Goto

Collaborating Authors

 angiography


Integrated Pipeline for Coronary Angiography With Automated Lesion Profiling, Virtual Stenting, and 100-Vessel FFR Validation

Kopanitsa, Georgy, Metsker, Oleg, Yakovlev, Alexey

arXiv.org Artificial Intelligence

Coronary angiography is the main tool for assessing coronary artery disease, but visual grading of stenosis is variable and only moderately related to ischaemia. Wire based fractional flow reserve (FFR) improves lesion selection but is not used systematically. Angiography derived indices such as quantitative flow ratio (QFR) offer wire free physiology, yet many tools are workflow intensive and separate from automated anatomy analysis and virtual PCI planning. We developed AngioAI-QFR, an end to end angiography only pipeline combining deep learning stenosis detection, lumen segmentation, centreline and diameter extraction, per millimetre Relative Flow Capacity profiling, and virtual stenting with automatic recomputation of angiography derived QFR. The system was evaluated in 100 consecutive vessels with invasive FFR as reference. Primary endpoints were agreement with FFR (correlation, mean absolute error) and diagnostic performance for FFR <= 0.80. On held out frames, stenosis detection achieved precision 0.97 and lumen segmentation Dice 0.78. Across 100 vessels, AngioAI-QFR correlated strongly with FFR (r = 0.89, MAE 0.045). The AUC for detecting FFR <= 0.80 was 0.93, with sensitivity 0.88 and specificity 0.86. The pipeline completed fully automatically in 93 percent of vessels, with median time to result 41 s. RFC profiling distinguished focal from diffuse capacity loss, and virtual stenting predicted larger QFR gain in focal than in diffuse disease. AngioAI-QFR provides a practical, near real time pipeline that unifies computer vision, functional profiling, and virtual PCI with automated angiography derived physiology.


TREAT-Net: Tabular-Referenced Echocardiography Analysis for Acute Coronary Syndrome Treatment Prediction

Kim, Diane, To, Minh Nguyen Nhat, Abdalla, Sherif, Tsang, Teresa S. M., Abolmaesumi, Purang, Luong, and Christina

arXiv.org Artificial Intelligence

Coronary angiography remains the gold standard for diagnosing Acute Coronary Syndrome (ACS). However, its resource-intensive and invasive nature can expose patients to procedural risks and diagnostic delays, leading to postponed treatment initiation. In this work, we introduce TREAT-Net, a multimodal deep learning framework for ACS treatment prediction that leverages non-invasive modalities, including echocardiography videos and structured clinical records. TREAT-Net integrates tabular-guided cross-attention to enhance video interpretation, along with a late fusion mechanism to align predictions across modalities. Trained on a dataset of over 9,000 ACS cases, the model outperforms unimodal and non-fused baselines, achieving a balanced accuracy of 67.6% and an AUROC of 71.1%. Cross-modality agreement analysis demonstrates 88.6% accuracy for intervention prediction.


Peptidomic-Based Prediction Model for Coronary Heart Disease Using a Multilayer Perceptron Neural Network

Celis-Porras, Jesus

arXiv.org Artificial Intelligence

Coronary heart disease (CHD) is a leading cause of death worldwide and contributes significantly to annual healthcare expenditures. To develop a non-invasive diagnostic approach, we designed a model based on a multilayer perceptron (MLP) neural network, trained on 50 key urinary peptide biomarkers selected via genetic algorithms. Treatment and control groups, each comprising 345 individuals, were balanced using the Synthetic Minority Over-sampling Technique (SMOTE). The neural network was trained using a stratified validation strategy. Using a network with three hidden layers of 60 neurons each and an output layer of two neurons, the model achieved a precision, sensitivity, and specificity of 95.67 percent, with an F1-score of 0.9565. The area under the ROC curve (AUC) reached 0.9748 for both classes, while the Matthews correlation coefficient (MCC) and Cohen's kappa coefficient were 0.9134 and 0.9131, respectively, demonstrating its reliability in detecting CHD. These results indicate that the model provides a highly accurate and robust non-invasive diagnostic tool for coronary heart disease.


A Survey of Multimodal Ophthalmic Diagnostics: From Task-Specific Approaches to Foundational Models

Luo, Xiaoling, Zheng, Ruli, Zheng, Qiaojian, Du, Zibo, Yang, Shuo, Ding, Meidan, Xu, Qihao, Liu, Chengliang, Shen, Linlin

arXiv.org Artificial Intelligence

Visual impairment represents a major global health challenge, with multimodal imaging providing complementary information that is essential for accurate ophthalmic diagnosis. This comprehensive survey systematically reviews the latest advances in multimodal deep learning methods in ophthalmology up to the year 2025. The review focuses on two main categories: task-specific multimodal approaches and large-scale multimodal foundation models. Task-specific approaches are designed for particular clinical applications such as lesion detection, disease diagnosis, and image synthesis. These methods utilize a variety of imaging modalities including color fundus photography, optical coherence tomography, and angiography. On the other hand, foundation models combine sophisticated vision-language architectures and large language models pretrained on diverse ophthalmic datasets. These models enable robust cross-modal understanding, automated clinical report generation, and decision support. The survey critically examines important datasets, evaluation metrics, and methodological innovations including self-supervised learning, attention-based fusion, and contrastive alignment. It also discusses ongoing challenges such as variability in data, limited annotations, lack of interpretability, and issues with generalizability across different patient populations. Finally, the survey outlines promising future directions that emphasize the use of ultra-widefield imaging and reinforcement learning-based reasoning frameworks to create intelligent, interpretable, and clinically applicable AI systems for ophthalmology.


Weakly Supervised Intracranial Aneurysm Detection and Segmentation in MR angiography via Multi-task UNet with Vesselness Prior

Rainville, Erin, Rasoulian, Amirhossein, Rivaz, Hassan, Xiao, Yiming

arXiv.org Artificial Intelligence

Intracranial aneurysms (IAs) are abnormal dilations of cerebral blood vessels that, if ruptured, can lead to life-threatening consequences. However, their small size and soft contrast in radiological scans often make it difficult to perform accurate and efficient detection and morphological analyses, which are critical in the clinical care of the disorder . Furthermore, the lack of large public datasets with voxel-wise expert annotations pose challenges for developing deep learning algorithms to address the issues. Therefore, we proposed a novel weakly supervised 3D multi-task UNet that integrates vesselness priors to jointly perform aneurysm detection and segmentation in time-of-flight MR angiography (TOF-MRA). Specifically, to robustly guide IA detection and segmentation, we employ the popular Frangi's vesselness filter to derive soft cerebrovascular priors for both network input and an attention block to conduct segmentation from the decoder and detection from an auxiliary branch. W e train our model on the Lausanne dataset with coarse ground truth segmentation, and evaluate it on the test set with refined labels from the same database. T o further assess our model's generalizability, we also validate it externally on the ADAM dataset. Our results demonstrate the superior performance of the proposed technique over the SOTA techniques for aneurysm segmentation (Dice = 0.614, 95%HD =1.38mm) and detection (false positive rate = 1.47, sensitivity = 92.9%).


Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat

Xu, Pusheng, Gong, Xia, Chen, Xiaolan, Zhang, Weiyi, Yang, Jiancheng, Yan, Bingjie, Yuan, Meng, Zheng, Yalin, He, Mingguang, Shi, Danli

arXiv.org Artificial Intelligence

Purpose: To develop a bilingual multimodal visual question answering (VQA) benchmark for evaluating VLMs in ophthalmology. Methods: Ophthalmic image posts and associated captions published between January 1, 2016, and December 31, 2024, were collected from WeChat Official Accounts. Based on these captions, bilingual question-answer (QA) pairs in Chinese and English were generated using GPT-4o-mini. QA pairs were categorized into six subsets by question type and language: binary (Binary_CN, Binary_EN), single-choice (Single-choice_CN, Single-choice_EN), and open-ended (Open-ended_CN, Open-ended_EN). The benchmark was used to evaluate the performance of three VLMs: GPT-4o, Gemini 2.0 Flash, and Qwen2.5-VL-72B-Instruct. Results: The final OphthalWeChat dataset included 3,469 images and 30,120 QA pairs across 9 ophthalmic subspecialties, 548 conditions, 29 imaging modalities, and 68 modality combinations. Gemini 2.0 Flash achieved the highest overall accuracy (0.548), outperforming GPT-4o (0.522, P < 0.001) and Qwen2.5-VL-72B-Instruct (0.514, P < 0.001). It also led in both Chinese (0.546) and English subsets (0.550). Subset-specific performance showed Gemini 2.0 Flash excelled in Binary_CN (0.687), Single-choice_CN (0.666), and Single-choice_EN (0.646), while GPT-4o ranked highest in Binary_EN (0.717), Open-ended_CN (BLEU-1: 0.301; BERTScore: 0.382), and Open-ended_EN (BLEU-1: 0.183; BERTScore: 0.240). Conclusions: This study presents the first bilingual VQA benchmark for ophthalmology, distinguished by its real-world context and inclusion of multiple examinations per patient. The dataset reflects authentic clinical decision-making scenarios and enables quantitative evaluation of VLMs, supporting the development of accurate, specialized, and trustworthy AI systems for eye care.


Label up: Learning Pulmonary Embolism Segmentation from Image Level Annotation through Model Explainability

Condrea, Florin, Rapaka, Saikiran, Leordeanu, Marius

arXiv.org Artificial Intelligence

Pulmonary Embolisms (PE) are a leading cause of cardiovascular death. Computed tomographic pulmonary angiography (CTPA) stands as the gold standard for diagnosing pulmonary embolisms (PE) and there has been a lot of interest in developing AI-based models for assisting in PE diagnosis. Performance of these algorithms has been hindered by the scarcity of annotated data, especially those with fine-grained delineation of the thromboembolic burden. In this paper we attempt to address this issue by introducing a weakly supervised learning pipeline, that leverages model explainability to generate fine-grained (pixel level) masks for embolisms starting from more coarse-grained (binary, image level) PE annotations. Furthermore, we show that training models using the automatically generated pixel annotations yields good PE localization performance. We demonstrate the effectiveness of our pipeline on the large-scale, multi-center RSPECT augmented dataset for PE detection and localization.


Segmentation of Coronary Artery Stenosis in X-ray Angiography using Mamba Models

Rostami, Ali, Fouladi, Fatemeh, Sajedi, Hedieh

arXiv.org Artificial Intelligence

Coronary artery disease stands as one of the primary contributors to global mortality rates. The automated identification of coronary artery stenosis from X-ray images plays a critical role in the diagnostic process for coronary heart disease. This task is challenging due to the complex structure of coronary arteries, intrinsic noise in X-ray images, and the fact that stenotic coronary arteries appear narrow and blurred in X-ray angiographies. This study employs five different variants of the Mamba-based model and one variant of the Swin Transformer-based model, primarily based on the U-Net architecture, for the localization of stenosis in Coronary artery disease. Our best results showed an F1 score of 68.79% for the U-Mamba BOT model, representing an 11.8% improvement over the semi-supervised approach.


Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance

Zhang, Weiyi, Huang, Siyu, Yang, Jiancheng, Chen, Ruoyu, Ge, Zongyuan, Zheng, Yingfeng, Shi, Danli, He, Mingguang

arXiv.org Artificial Intelligence

Fundus Fluorescein Angiography (FFA) is a critical tool for assessing retinal vascular dynamics and aiding in the diagnosis of eye diseases. However, its invasive nature and less accessibility compared to Color Fundus (CF) images pose significant challenges. Current CF to FFA translation methods are limited to static generation. In this work, we pioneer dynamic FFA video generation from static CF images. We introduce an autoregressive GAN for smooth, memory-saving frame-by-frame FFA synthesis. To enhance the focus on dynamic lesion changes in FFA regions, we design a knowledge mask based on clinical experience. Leveraging this mask, our approach integrates innovative knowledge mask-guided techniques, including knowledge-boosted attention, knowledge-aware discriminators, and mask-enhanced patchNCE loss, aimed at refining generation in critical areas and addressing the pixel misalignment challenge. Our method achieves the best FVD of 1503.21 and PSNR of 11.81 compared to other common video generation approaches. Human assessment by an ophthalmologist confirms its high generation quality. Notably, our knowledge mask surpasses supervised lesion segmentation masks, offering a promising non-invasive alternative to traditional FFA for research and clinical applications. The code is available at https://github.com/Michi-3000/Fundus2Video.


EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging

Shi, Danli, Zhang, Weiyi, Chen, Xiaolan, Liu, Yexin, Yang, Jiancheng, Huang, Siyu, Tham, Yih Chung, Zheng, Yingfeng, He, Mingguang

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is vital in ophthalmology, tackling tasks like diagnosis, classification, and visual question answering (VQA). However, existing AI models in this domain often require extensive annotation and are task-specific, limiting their clinical utility. While recent developments have brought about foundation models for ophthalmology, they are limited by the need to train separate weights for each imaging modality, preventing a comprehensive representation of multi-modal features. This highlights the need for versatile foundation models capable of handling various tasks and modalities in ophthalmology. To address this gap, we present EyeFound, a multimodal foundation model for ophthalmic images. Unlike existing models, EyeFound learns generalizable representations from unlabeled multimodal retinal images, enabling efficient model adaptation across multiple applications. Trained on 2.78 million images from 227 hospitals across 11 ophthalmic modalities, EyeFound facilitates generalist representations and diverse multimodal downstream tasks, even for detecting challenging rare diseases. It outperforms previous work RETFound in diagnosing eye diseases, predicting systemic disease incidents, and zero-shot multimodal VQA. EyeFound provides a generalizable solution to improve model performance and lessen the annotation burden on experts, facilitating widespread clinical AI applications for retinal imaging.