Why you 'see' things in the dark, according to an ophthalmologist

Popular Science

Science explains why we see flickers of light and patterns in the darkness. Our eyes sometimes really do play tricks on us at night. In 1999, Daniel Myrick and Eduardo Sánchez shot one of the definitive horror films of the era on a budget of roughly $60,000. The Blair Witch Project is a study in omission, in the conspicuous absence of the visual effects characteristic of the genre. In lieu of baroque prosthetic gore and over-the-top CGI effects, the movie leans into silence and darkness for much of its 81-minute run time.


Detection of retinal diseases using an accelerated reused convolutional network

Kasani, Amin Ahmadi, Sajedi, Hedieh

arXiv.org Artificial Intelligence

Convolutional neural networks are constantly being developed: some efforts improve accuracy, some increase speed, and some increase accessibility. Improving accessibility allows the use of neural networks in a wider range of tasks, including the detection of eye diseases. Early diagnosis of eye diseases and visiting an ophthalmologist can prevent many vision disorders. Because of the importance of this issue, various datasets have been collected from the cornea of the eye to facilitate the process of building neural network models. However, most of the methods introduced in the past are computationally complicated. In this study, we tried to increase the accessibility of deep neural network models. We did this at the most basic level, i.e., by changing and improving the convolutional layers. In doing so, we created a new general model that uses our new convolutional layer, named the ArConv layer. Thanks to the proper functioning of the new layer, the model has suitable complexity for use on mobile phones and performs the task of diagnosing the presence of disease with high accuracy. The final model we introduce has only 1.3 million parameters; compared to the MobileNetV2 model, which has 2.2 million parameters, after training both models only on the RFMiD dataset under the same conditions, ours achieved better accuracy in the final evaluation on the RFMiD test set. Keywords: eye disease recognition, deep convolutional neural networks, machine learning, computer-aided diagnosis, object detection. Vision is one of the most important human senses; according to the evolutionary characteristics of humans, vision is the largest system in the brain and occupies 20-30% of the cortex [1]. As a result, it has a great impact on all aspects of life, including health and the ability to learn, work, and help others, and its absence has severe consequences for people's lives. Eye diseases can cause vision disorders and blindness, and people who live in vulnerable communities have less access to medical diagnostic facilities, which makes the problem bigger.
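
The abstract does not define the ArConv layer itself, so as a hedged illustration of how redesigning the convolutional layer can shrink a model, here is a minimal PyTorch sketch that uses a depthwise-separable convolution (the MobileNet building block) as a hypothetical stand-in:

```python
# Minimal sketch: a parameter-efficient convolutional block in PyTorch.
# The paper's ArConv layer is not specified in the abstract; a depthwise-
# separable convolution stands in here purely to show how swapping the
# convolutional layer reduces parameter count.
import torch
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depthwise conv followed by a 1x1 pointwise conv (hypothetical
    stand-in for the paper's ArConv layer)."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size, stride,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

def n_params(m):
    return sum(p.numel() for p in m.parameters())

standard = nn.Conv2d(64, 128, kernel_size=3, padding=1)
separable = SeparableConv2d(64, 128)
print(n_params(standard), n_params(separable))  # 73856 vs 8960 parameters
```

The same accuracy-versus-size trade the abstract describes (1.3M vs. 2.2M parameters) comes from stacking blocks like this in place of standard convolutions.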


Early Detection of Visual Impairments at Home Using a Smartphone Red-Eye Reflex Test

Massmann, Judith, Lichtenstein, Alexander, López, Francisco M.

arXiv.org Artificial Intelligence

Numerous visual impairments can be detected in red-eye reflex images from young children. The so-called Bruckner test is traditionally performed by ophthalmologists in clinical settings. Thanks to the recent technological advances in smartphones and artificial intelligence, it is now possible to recreate the Bruckner test using a mobile device. In this paper, we present a first study conducted during the development of KidsVisionCheck, a free application that can perform vision screening with a mobile device using red-eye reflex images. The underlying model relies on deep neural networks trained on children's pupil images collected and labeled by an ophthalmologist. With an accuracy of 90% on unseen test data, our model provides highly reliable performance without the necessity of specialist equipment. Furthermore, we can identify the optimal conditions for data collection, which can in turn be used to provide immediate feedback to the users. In summary, this work marks a first step toward accessible pediatric vision screenings and early intervention for vision abnormalities worldwide.
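
The abstract does not describe KidsVisionCheck's architecture; as a hedged sketch, a small CNN classifier over red-eye-reflex pupil crops might look like this in PyTorch (all layer sizes are illustrative assumptions):

```python
# Hedged sketch, not the KidsVisionCheck model: a small CNN that classifies
# red-eye-reflex pupil crops as "normal" vs. "refer to specialist".
import torch
import torch.nn as nn

class PupilNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # global pooling keeps the head tiny
        )
        self.head = nn.Linear(32, 2)  # two classes: normal vs. refer

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = PupilNet()
logits = model(torch.randn(4, 3, 64, 64))  # a batch of 4 RGB pupil crops
print(logits.shape)  # torch.Size([4, 2])
```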


DRetNet: A Novel Deep Learning Framework for Diabetic Retinopathy Diagnosis

Okuwobi, Idowu Paul, Liu, Jingyuan, Wan, Jifeng, Jiang, Jiaojiao

arXiv.org Artificial Intelligence

Diabetic retinopathy (DR) is a leading cause of blindness worldwide, necessitating early detection to prevent vision loss. Current automated DR detection systems often struggle with poor-quality images, lack interpretability, and insufficient integration of domain-specific knowledge. To address these challenges, we introduce a novel framework that integrates three innovative contributions: (1) Adaptive Retinal Image Enhancement Using Physics-Informed Neural Networks (PINNs): this technique dynamically enhances retinal images by incorporating physical constraints, improving the visibility of critical features such as microaneurysms, hemorrhages, and exudates; (2) Hybrid Feature Fusion Network (HFFN): by combining deep learning embeddings with handcrafted features, HFFN leverages both learned representations and domain-specific knowledge to enhance generalization and accuracy; (3) Multi-Stage Classifier with Uncertainty Quantification: this method breaks down the classification process into logical stages, providing interpretable predictions and confidence scores, thereby improving clinical trust. The proposed framework achieves an accuracy of 92.7%, a precision of 92.5%, a recall of 92.6%, an F1-score of 92.5%, an AUC of 97.8%, a mAP of 0.96, and an MCC of 0.85. Ophthalmologists rated the framework's predictions as highly clinically relevant (4.8/5), highlighting its alignment with real-world diagnostic needs. Qualitative analyses, including Grad-CAM visualizations and uncertainty heatmaps, further enhance the interpretability and trustworthiness of the system. The framework demonstrates robust performance across diverse conditions, including low-quality images, noisy data, and unseen datasets. These features make the proposed framework a promising tool for clinical adoption, enabling more accurate and reliable DR detection in resource-limited settings.
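The abstract names the HFFN only at a high level; a minimal sketch of the core fusion idea, with illustrative layer sizes rather than the paper's actual configuration, is to concatenate deep embeddings with handcrafted features before a grading head:

```python
# Hedged sketch of hybrid feature fusion: concatenate a learned CNN
# embedding with a handcrafted feature vector before classification.
# Dimensions and features are illustrative assumptions, not the paper's HFFN.
import torch
import torch.nn as nn

class HybridFusionHead(nn.Module):
    def __init__(self, deep_dim=512, handcrafted_dim=32, n_grades=5):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(deep_dim + handcrafted_dim, 128),
            nn.ReLU(),
            nn.Linear(128, n_grades),  # e.g., 5 DR severity grades
        )

    def forward(self, deep_feats, handcrafted_feats):
        fused = torch.cat([deep_feats, handcrafted_feats], dim=1)
        return self.classifier(fused)

head = HybridFusionHead()
deep = torch.randn(8, 512)        # e.g., backbone embeddings
crafted = torch.randn(8, 32)      # e.g., vessel / lesion statistics
print(head(deep, crafted).shape)  # torch.Size([8, 5])
```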


CataractSurg-80K: Knowledge-Driven Benchmarking for Structured Reasoning in Ophthalmic Surgery Planning

Meng, Yang, Pan, Zewen, Lu, Yandi, Huang, Ruobing, Liao, Yanfeng, Yang, Jiarui

arXiv.org Artificial Intelligence

Cataract surgery remains one of the most widely performed and effective procedures for vision restoration. Effective surgical planning requires integrating diverse clinical examinations for patient assessment, intraocular lens (IOL) selection, and risk evaluation. Large language models (LLMs) have shown promise in supporting clinical decision-making. However, existing LLMs often lack the domain-specific expertise to interpret heterogeneous ophthalmic data and provide actionable surgical plans. To enhance the model's ability to interpret heterogeneous ophthalmic reports, we propose a knowledge-driven Multi-Agent System (MAS), where each agent simulates the reasoning process of specialist ophthalmologists, converting raw clinical inputs into structured, actionable summaries in both training and deployment stages. Building on MAS, we introduce CataractSurg-80K, the first large-scale benchmark for cataract surgery planning that incorporates structured clinical reasoning. Each case is annotated with diagnostic questions, expert reasoning chains, and structured surgical recommendations. We further introduce Qwen-CSP, a domain-specialized model built on Qwen-4B, fine-tuned through a multi-stage process tailored for surgical planning. Comprehensive experiments show that Qwen-CSP outperforms strong general-purpose LLMs across multiple metrics. Our work delivers a high-quality dataset, a rigorous benchmark, and a domain-adapted LLM to facilitate future research in medical AI reasoning and decision support.
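As a hedged sketch of the multi-agent pattern described here, with hypothetical roles, prompts, and a generic `llm` callable (the paper's actual agents and prompts are not given in the abstract):

```python
# Hedged sketch of the multi-agent idea: each "agent" wraps an LLM call with
# a specialist prompt and returns a structured summary. Roles, prompt text,
# and the llm interface are illustrative assumptions, not the paper's
# CataractSurg-80K pipeline.
from dataclasses import dataclass
from typing import Callable

@dataclass
class SpecialistAgent:
    role: str                   # e.g., "biometry specialist" (hypothetical)
    instructions: str           # role-specific reasoning instructions
    llm: Callable[[str], str]   # any text-in/text-out LLM interface

    def summarize(self, report: str) -> str:
        prompt = (f"You are a {self.role}. {self.instructions}\n"
                  f"Clinical report:\n{report}\n"
                  "Return a structured summary as key: value lines.")
        return self.llm(prompt)

def plan_surgery(report: str, agents: list[SpecialistAgent]) -> dict:
    # Each agent contributes one structured section of the surgical plan.
    return {agent.role: agent.summarize(report) for agent in agents}
```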


BEnchmarking LLMs for Ophthalmology (BELO) for Ophthalmological Knowledge and Reasoning

Srinivasan, Sahana, Ai, Xuguang, Lo, Thaddaeus Wai Soon, Gilson, Aidan, Zou, Minjie, Zou, Ke, Kim, Hyunjae, Yang, Mingjia, Pushpanathan, Krithi, Yew, Samantha, Loke, Wan Ting, Goh, Jocelyn, Chen, Yibing, Kong, Yiming, Fu, Emily Yuelei, Hui, Michelle Ongyong, Nwanyanwu, Kristen, Dave, Amisha, Li, Kelvin Zhenghao, Sun, Chen-Hsin, Chia, Mark, Yang, Gabriel Dawei, Wong, Wendy Meihua, Chen, David Ziyou, Liu, Dianbo, Singer, Maxwell, Antaki, Fares, Del Priore, Lucian V, Jonas, Jost, Adelman, Ron, Chen, Qingyu, Tham, Yih-Chung

arXiv.org Artificial Intelligence

Current benchmarks evaluating large language models (LLMs) in ophthalmology are limited in scope and disproportionately prioritise accuracy. We introduce BELO (BEnchmarking LLMs for Ophthalmology), a standardized and comprehensive evaluation benchmark developed through multiple rounds of expert checking by 13 ophthalmologists. BELO assesses ophthalmology-related clinical accuracy and reasoning quality. Using keyword matching and a fine-tuned PubMedBERT model, we curated ophthalmology-specific multiple-choice questions (MCQs) from diverse medical datasets (BCSC, MedMCQA, MedQA, BioASQ, and PubMedQA). The dataset underwent multiple rounds of expert checking. Duplicate and substandard questions were systematically removed. Ten ophthalmologists refined the explanations of each MCQ's correct answer. This was further adjudicated by three senior ophthalmologists. To illustrate BELO's utility, we evaluated six LLMs (OpenAI o1, o3-mini, GPT-4o, DeepSeek-R1, Llama-3-8B, and Gemini 1.5 Pro) using accuracy, macro-F1, and five text-generation metrics (ROUGE-L, BERTScore, BARTScore, METEOR, and AlignScore). In a further evaluation involving human experts, two ophthalmologists qualitatively reviewed 50 randomly selected outputs for accuracy, comprehensiveness, and completeness. BELO consists of 900 high-quality, expert-reviewed questions aggregated from five sources: BCSC (260), BioASQ (10), MedMCQA (572), MedQA (40), and PubMedQA (18). A public leaderboard has been established to promote transparent evaluation and reporting. Importantly, the BELO dataset will remain a hold-out, evaluation-only benchmark to ensure fair and reproducible comparisons of future models.
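A minimal sketch of the MCQ scoring BELO reports, using scikit-learn's accuracy and macro-F1 on invented gold/predicted option labels (the dataset loading and model calls are omitted):

```python
# Minimal sketch of MCQ scoring with accuracy and macro-F1; the answer
# lists here are illustrative, not BELO data.
from sklearn.metrics import accuracy_score, f1_score

gold = ["A", "C", "B", "D", "A"]  # expert-adjudicated answers
pred = ["A", "C", "D", "D", "B"]  # model-selected options

print("accuracy:", accuracy_score(gold, pred))
print("macro-F1:", f1_score(gold, pred, average="macro"))
```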


Interpretable Few-Shot Retinal Disease Diagnosis with Concept-Guided Prompting of Vision-Language Models

Mehta, Deval, Jiang, Yiwen, Jan, Catherine L, He, Mingguang, Jadhav, Kshitij, Ge, Zongyuan

arXiv.org Artificial Intelligence

Recent advancements in deep learning have shown significant potential for classifying retinal diseases using color fundus images. However, existing works predominantly rely exclusively on image data, lack interpretability in their diagnostic decisions, and treat medical professionals primarily as annotators for ground truth labeling. To fill this gap, we implement two key strategies: extracting interpretable concepts of retinal diseases using the knowledge base of GPT models and incorporating these concepts as a language component in prompt-learning to train vision-language (VL) models with both fundus images and their associated concepts. Our method not only improves retinal disease classification but also enriches few-shot and zero-shot detection (novel disease detection), while offering the added benefit of concept-based model interpretability. Our extensive evaluation across two diverse retinal fundus image datasets illustrates substantial performance gains in VL-model based few-shot methodologies through our concept integration approach, demonstrating an average improvement of approximately 5.8% and 2.7% mean average precision for 16-shot learning and zero-shot (novel class) detection respectively. Our method marks a pivotal step towards interpretable and efficient retinal disease recognition for real-world clinical applications.
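The paper's prompt-learning method is not detailed in the abstract; as a hedged illustration of the underlying idea of concept-driven classification, here is a zero-shot sketch with a generic CLIP model via Hugging Face transformers (the concept strings and input file are illustrative, not the paper's concepts):

```python
# Hedged sketch: score a fundus image against textual disease concepts with
# a generic CLIP model. This illustrates concept-driven classification, not
# the paper's prompt-learning method.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

concepts = [  # illustrative concept prompts
    "a fundus photograph with scattered microaneurysms",
    "a fundus photograph with optic disc cupping",
    "a normal fundus photograph",
]
image = Image.open("fundus.jpg")  # hypothetical input image
inputs = processor(text=concepts, images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(dict(zip(concepts, probs[0].tolist())))
```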


OphthBench: A Comprehensive Benchmark for Evaluating Large Language Models in Chinese Ophthalmology

Zhou, Chengfeng, Wang, Ji, Qin, Juanjuan, Wang, Yining, Sun, Ling, Dai, Weiwei

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown significant promise across various medical applications, with ophthalmology being a notable area of focus. Many ophthalmic tasks have shown substantial improvement through the integration of LLMs. However, before these models can be widely adopted in clinical practice, evaluating their capabilities and identifying their limitations is crucial. To address this research gap and support the real-world application of LLMs, we introduce the OphthBench, a specialized benchmark designed to assess LLM performance within the context of Chinese ophthalmic practices. This benchmark systematically divides a typical ophthalmic clinical workflow into five key scenarios: Education, Triage, Diagnosis, Treatment, and Prognosis. For each scenario, we developed multiple tasks featuring diverse question types, resulting in a comprehensive benchmark comprising 9 tasks and 591 questions. This comprehensive framework allows for a thorough assessment of LLMs' capabilities and provides insights into their practical application in Chinese ophthalmology. Using this benchmark, we conducted extensive experiments and analyzed the results from 39 popular LLMs. Our evaluation highlights the current gap between LLM development and its practical utility in clinical settings, providing a clear direction for future advancements. By bridging this gap, we aim to unlock the potential of LLMs and advance their development in ophthalmology.
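A minimal sketch of per-scenario score aggregation in the spirit of OphthBench's five-scenario layout (the result records are invented for illustration):

```python
# Minimal sketch: aggregate per-question correctness into per-scenario
# accuracy across OphthBench's five scenarios. Records are illustrative.
from collections import defaultdict

results = [  # (scenario, answered correctly?) per question
    ("Education", True), ("Triage", False), ("Diagnosis", True),
    ("Treatment", True), ("Prognosis", False), ("Diagnosis", False),
]

by_scenario = defaultdict(list)
for scenario, correct in results:
    by_scenario[scenario].append(correct)

for scenario, marks in sorted(by_scenario.items()):
    print(f"{scenario}: {sum(marks) / len(marks):.2f}")
```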


Federated Learning for Diabetic Retinopathy Diagnosis: Enhancing Accuracy and Generalizability in Under-Resourced Regions

Raj, Gajan Mohan, Morley, Michael G., Eslami, Mohammad

arXiv.org Artificial Intelligence

Diabetic retinopathy is the leading cause of vision loss in working-age adults worldwide, yet under-resourced regions lack ophthalmologists. Current state-of-the-art deep learning systems struggle at these institutions due to limited generalizability. This paper explores a novel federated learning system for diabetic retinopathy diagnosis with the EfficientNetB0 architecture to leverage fundus data from multiple institutions to improve diagnostic generalizability at under-resourced hospitals while preserving patient privacy. The federated model achieved 93.21% accuracy in five-category classification on an unseen dataset and 91.05% on lower-quality images from a simulated under-resourced institution. The model was deployed onto two apps for quick and accurate diagnosis.
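
A hedged sketch of federated averaging (FedAvg), the standard aggregation step behind this kind of privacy-preserving multi-institution training (the paper's exact protocol and EfficientNetB0 details are not reproduced here):

```python
# Hedged sketch of FedAvg: the server averages client model weights, e.g.
# weighted by local dataset size, without ever seeing patient images.
import torch

def fedavg(state_dicts, weights):
    """Weighted average of client state_dicts (weights ~ local data sizes)."""
    total = sum(weights)
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(w * sd[key].float()
                       for sd, w in zip(state_dicts, weights)) / total
    return avg

# Usage: each round, every hospital trains locally and sends only weights;
# the server averages them and broadcasts the result back (names hypothetical):
# global_state = fedavg([h1.state_dict(), h2.state_dict()], weights=[1200, 800])
```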


VisionUnite: A Vision-Language Foundation Model for Ophthalmology Enhanced with Clinical Knowledge

Li, Zihan, Song, Diping, Yang, Zefeng, Wang, Deming, Li, Fei, Zhang, Xiulan, Kinahan, Paul E., Qiao, Yu

arXiv.org Artificial Intelligence

The need for improved diagnostic methods in ophthalmology is acute, especially in the less developed regions with limited access to specialists and advanced equipment. Therefore, we introduce VisionUnite, a novel vision-language foundation model for ophthalmology enhanced with clinical knowledge. VisionUnite has been pretrained on an extensive dataset comprising 1.24 million image-text pairs, and further refined using our proposed MMFundus dataset, which includes 296,379 high-quality fundus image-text pairs and 889,137 simulated doctor-patient dialogue instances. Our experiments indicate that VisionUnite outperforms existing generative foundation models such as GPT-4V and Gemini Pro. It also demonstrates diagnostic capabilities comparable to junior ophthalmologists. VisionUnite performs well in various clinical scenarios including open-ended multi-disease diagnosis, clinical explanation, and patient interaction, making it a highly versatile tool for initial ophthalmic disease screening. VisionUnite can also serve as an educational aid for junior ophthalmologists, accelerating their acquisition of knowledge regarding both common and rare ophthalmic conditions. VisionUnite represents a significant advancement in ophthalmology, with broad implications for diagnostics, medical education, and understanding of disease mechanisms.
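
The abstract does not state VisionUnite's pretraining objective; as a hedged sketch, the standard CLIP-style contrastive loss used to pretrain vision-language models on image-text pairs looks like this:

```python
# Hedged sketch of the symmetric image-text contrastive (InfoNCE) loss
# commonly used for vision-language pretraining; not necessarily
# VisionUnite's actual objective.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # pairwise similarities
    targets = torch.arange(len(img_emb))          # matched pairs on diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = contrastive_loss(torch.randn(16, 256), torch.randn(16, 256))
print(loss.item())
```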