cataract surgery
Cataract-LMM: Large-Scale, Multi-Source, Multi-Task Benchmark for Deep Learning in Surgical Video Analysis
Ahmadi, Mohammad Javad, Gandomi, Iman, Abdi, Parisa, Mohammadi, Seyed-Farzad, Taslimi, Amirhossein, Khodaparast, Mehdi, Hashemi, Hassan, Tavakoli, Mahdi, Taghirad, Hamid D.
The persistent gap between the growing global surgical demand and the trained surgical workforce [1] highlights the need to develop scalable solutions that can enhance training paradigms and optimize workflow management [2]. Computer-assisted surgery (CAS) systems are one approach to address this challenge, with applications in preoperative planning [3], intraoperative guidance [4], and standardized postoperative assessment [5, 6]. The development and validation of these advanced CAS capabilities fundamentally depend on access to large-scale, deeply annotated surgical video datasets that capture procedural phases, instrument-tissue interactions, and technical skill cues [7, 8]. Phacoemulsification cataract surgery is the most common ophthalmic procedure worldwide and the primary intervention for avoidable blindness [9, 10], making it a critical domain for developing data-driven CAS with potential applications in clinical workflows and training [11, 12]. Publicly available datasets for developing CAS in cataract surgery, such as Cataract-1K [13] and CaDIS [14], are limited by their single-center origin and narrow annotation scopes [15]. The absence of a multi-source dataset with comprehensive and multi-layered annotations, including objective skill assessments, has limited the development of generalizable multi-task deep learning models [11]. To address this gap, we present the Cataract-LMM (Large-scale, Multi-source, Multi-task) Dataset, comprising 3,000 phacoemulsification procedures recorded at two distinct clinical centers (Farabi and Noor Eye Hospitals, Tehran, Iran) between December 2021 and March 2025. The dataset is enriched with four complementary layers of annotations on subsets of the data:
1. Temporal Phase Labels (Phase): Frame-wise annotations for 13 surgical phases across 150 videos to support automated workflow recognition.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.26)
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Surgery (1.00)
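As a concrete illustration of the phase annotation layer, frame-wise labels can be collapsed into (phase, start, end) segments for workflow analysis; a minimal sketch, assuming a simple per-frame label list (the phase names below are illustrative, not the dataset's 13-phase ontology):

```python
def labels_to_segments(frame_labels):
    """Collapse a per-frame phase label sequence into (phase, start, end) segments (end exclusive)."""
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current segment at the end of the list or on a phase change.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((frame_labels[start], start, i))
            start = i
    return segments

# Hypothetical excerpt of a labeled clip.
labels = ["Incision"] * 3 + ["Capsulorhexis"] * 4 + ["Phacoemulsification"] * 2
print(labels_to_segments(labels))
# → [('Incision', 0, 3), ('Capsulorhexis', 3, 7), ('Phacoemulsification', 7, 9)]
```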
CataractSurg-80K: Knowledge-Driven Benchmarking for Structured Reasoning in Ophthalmic Surgery Planning
Meng, Yang, Pan, Zewen, Lu, Yandi, Huang, Ruobing, Liao, Yanfeng, Yang, Jiarui
Cataract surgery remains one of the most widely performed and effective procedures for vision restoration. Effective surgical planning requires integrating diverse clinical examinations for patient assessment, intraocular lens (IOL) selection, and risk evaluation. Large language models (LLMs) have shown promise in supporting clinical decision-making. However, existing LLMs often lack the domain-specific expertise to interpret heterogeneous ophthalmic data and provide actionable surgical plans. To enhance the model's ability to interpret heterogeneous ophthalmic reports, we propose a knowledge-driven Multi-Agent System (MAS), where each agent simulates the reasoning process of specialist ophthalmologists, converting raw clinical inputs into structured, actionable summaries in both training and deployment stages. Building on MAS, we introduce CataractSurg-80K, the first large-scale benchmark for cataract surgery planning that incorporates structured clinical reasoning. Each case is annotated with diagnostic questions, expert reasoning chains, and structured surgical recommendations. We further introduce Qwen-CSP, a domain-specialized model built on Qwen-4B, fine-tuned through a multi-stage process tailored for surgical planning. Comprehensive experiments show that Qwen-CSP outperforms strong general-purpose LLMs across multiple metrics. Our work delivers a high-quality dataset, a rigorous benchmark, and a domain-adapted LLM to facilitate future research in medical AI reasoning and decision support.
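The specialist-agent idea can be pictured as a pipeline in which each agent reads one kind of raw report and emits structured fields that are merged into one actionable summary; a minimal sketch, with agent names, fields, and thresholds invented for illustration rather than taken from CataractSurg-80K:

```python
# Each "agent" mimics a specialist reading one report type and emitting
# structured fields. All field names and cutoffs here are hypothetical.
def biometry_agent(raw):
    # e.g., flag a long axial length as a myopia-related planning consideration
    return {"axial_length_mm": raw["axial_length_mm"],
            "long_eye": raw["axial_length_mm"] > 26.0}

def cornea_agent(raw):
    return {"astigmatism_d": raw["corneal_astigmatism_d"],
            "toric_iol_candidate": raw["corneal_astigmatism_d"] >= 1.0}

def plan_summary(raw_reports):
    """Run each specialist agent on its report and merge into one structured summary."""
    summary = {}
    summary.update(biometry_agent(raw_reports["biometry"]))
    summary.update(cornea_agent(raw_reports["cornea"]))
    return summary

case = {"biometry": {"axial_length_mm": 26.8},
        "cornea": {"corneal_astigmatism_d": 1.5}}
print(plan_summary(case))
```

In the paper's framing, this structured summary (rather than the raw reports) is what the downstream planning model consumes in both training and deployment.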
Language Enhanced Model for Eye (LEME): An Open-Source Ophthalmology-Specific Large Language Model
Gilson, Aidan, Ai, Xuguang, Xie, Qianqian, Srinivasan, Sahana, Pushpanathan, Krithi, Singer, Maxwell B., Huang, Jimin, Kim, Hyunjae, Long, Erping, Wan, Peixing, Del Priore, Luciano V., Ohno-Machado, Lucila, Xu, Hua, Liu, Dianbo, Adelman, Ron A., Tham, Yih-Chung, Chen, Qingyu
Large Language Models (LLMs) are poised to revolutionize healthcare. Ophthalmology-specific LLMs remain scarce and underexplored. We introduced an open-source, specialized LLM for ophthalmology, termed Language Enhanced Model for Eye (LEME). LEME was initially pre-trained on the Llama2 70B framework and further fine-tuned with a corpus of ~127,000 non-copyrighted training instances curated from ophthalmology-specific case reports, abstracts, and open-source study materials. We benchmarked LEME against eight other LLMs, namely, GPT-3.5, GPT-4, three Llama2 models (7B, 13B, 70B), PMC-LLAMA 13B, Meditron 70B, and EYE-Llama (another ophthalmology-specific LLM). Evaluations included four internal validation tasks: abstract completion, fill-in-the-blank, multiple-choice questions (MCQ), and short-answer QA. External validation tasks encompassed long-form QA, MCQ, patient EHR summarization, and clinical QA. Evaluation metrics included Rouge-L scores, accuracy, and expert evaluation of correctness, completeness, and readability. In internal validations, LEME consistently outperformed its counterparts, achieving Rouge-L scores of 0.20 in abstract completion (all p<0.05), 0.82 in fill-in-the-blank (all p<0.0001), and 0.22 in short-answer QA (all p<0.0001, except versus GPT-4). In external validations, LEME excelled in long-form QA with a Rouge-L of 0.19 (all p<0.0001), ranked second in MCQ accuracy (0.68; all p<0.0001), and scored highest in EHR summarization and clinical QA (ranging from 4.24 to 4.83 out of 5 for correctness, completeness, and readability). LEME's emphasis on robust fine-tuning and the use of non-copyrighted data represents a breakthrough in open-source ophthalmology-specific LLMs, offering the potential to revolutionize execution of clinical tasks while democratizing research collaboration.
- North America > United States > Maryland > Montgomery County > Bethesda (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
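Rouge-L, the main automatic metric reported above, scores a candidate against a reference by the length of their longest common subsequence; a minimal sketch of Rouge-L F1 over whitespace tokens (production evaluations typically add stemming and tokenization rules):

```python
def rouge_l_f1(candidate, reference):
    """Rouge-L F1: harmonic mean of LCS-based precision and recall over tokens."""
    c, r = candidate.split(), reference.split()
    # Dynamic-programming table for longest common subsequence length.
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i, ct in enumerate(c):
        for j, rt in enumerate(r):
            dp[i + 1][j + 1] = dp[i][j] + 1 if ct == rt else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[len(c)][len(r)]
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(c), lcs / len(r)
    return 2 * prec * rec / (prec + rec)

print(rouge_l_f1("the cat sat", "the cat sat on the mat"))
```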
CataractBot: An LLM-Powered Expert-in-the-Loop Chatbot for Cataract Patients
Ramjee, Pragnya, Sachdeva, Bhuvan, Golechha, Satvik, Kulkarni, Shreyas, Fulari, Geeta, Murali, Kaushik, Jain, Mohit
The healthcare landscape is evolving, with patients seeking more reliable information about their health conditions, treatment options, and potential risks. Despite the abundance of information sources, the digital age overwhelms individuals with excess, often inaccurate information. Patients primarily trust doctors and hospital staff, highlighting the need for expert-endorsed health information. However, the pressure on experts has led to reduced communication time, impacting information sharing. To address this gap, we propose CataractBot, an experts-in-the-loop chatbot powered by large language models (LLMs). Developed in collaboration with a tertiary eye hospital in India, CataractBot answers cataract surgery related questions instantly by querying a curated knowledge base, and provides expert-verified responses asynchronously. CataractBot features multimodal support and multilingual capabilities. In an in-the-wild deployment study with 49 participants, CataractBot proved valuable, providing anytime accessibility, saving time, and accommodating diverse literacy levels. Trust was established through expert verification. Broadly, our results could inform future work on designing expert-mediated LLM bots.
- Asia > India > Karnataka > Bengaluru (0.14)
- North America > United States > California (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Surgery (1.00)
- Health & Medicine > Health Care Providers & Services (1.00)
- Health & Medicine > Consumer Health (1.00)
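The expert-in-the-loop pattern can be pictured as: answer immediately from a curated knowledge base, then queue the exchange for asynchronous expert verification; a minimal sketch, with naive word-overlap retrieval standing in for the LLM and a hypothetical two-entry knowledge base:

```python
# Hypothetical entries; not CataractBot's actual knowledge base.
KNOWLEDGE_BASE = {
    "When can I resume driving after cataract surgery?":
        "Typically after your first follow-up visit, once your surgeon confirms your vision.",
    "Is cataract surgery painful?":
        "The procedure is usually painless under local anesthesia; mild irritation afterwards is common.",
}
pending_review = []  # (question, draft_answer) pairs awaiting expert sign-off

def answer(question):
    """Answer instantly from the KB; log the exchange for asynchronous expert verification."""
    def overlap(a, b):
        return len(set(a.lower().split()) & set(b.lower().split()))
    best = max(KNOWLEDGE_BASE, key=lambda k: overlap(k, question))
    draft = KNOWLEDGE_BASE[best]
    pending_review.append((question, draft))  # an expert verifies or corrects this later
    return draft

print(answer("Is the surgery painful?"))
```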
Toward a Surgeon-in-the-Loop Ophthalmic Robotic Apprentice using Reinforcement and Imitation Learning
Gomaa, Amr, Mahdy, Bilal, Kleer, Niko, Krüger, Antonio
Robotic-assisted surgical systems have demonstrated significant potential in enhancing surgical precision and minimizing human errors. However, existing systems lack the ability to accommodate the unique preferences and requirements of individual surgeons. Additionally, they primarily focus on general surgeries (e.g., laparoscopy) and are not suitable for highly precise microsurgeries, such as ophthalmic procedures. Thus, we propose a simulation-based image-guided approach for surgeon-centered autonomous agents that can adapt to the individual surgeon's skill level and preferred surgical techniques during ophthalmic cataract surgery. Our approach utilizes a simulated environment to train reinforcement and imitation learning agents guided by image data to perform all tasks of the incision phase of cataract surgery. By integrating the surgeon's actions and preferences into the training process with the surgeon-in-the-loop, our approach enables the robot to implicitly learn and adapt to the individual surgeon's unique approach through demonstrations. This results in a more intuitive and personalized surgical experience for the surgeon. Simultaneously, it ensures consistent performance for the autonomous robotic apprentice. We define and evaluate the effectiveness of our approach using our proposed metrics; and highlight the trade-off between a generic agent and a surgeon-centered adapted agent. Moreover, our approach has the potential to extend to other ophthalmic surgical procedures, opening the door to a new generation of surgeon-in-the-loop autonomous surgical robots. We provide an open-source simulation framework for future development and reproducibility.
- North America > United States > Missouri (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Surgery (1.00)
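To picture the reinforcement-learning side, a toy tabular Q-learning agent can learn a fixed ordered task; the three-step "incision" MDP and its rewards below are invented for illustration, while the paper's agents are image-guided and trained in a full surgical simulator:

```python
import random

STEPS = ["approach", "incise", "retract"]  # correct ordered actions (hypothetical)
N_STATES = len(STEPS) + 1                  # state = number of steps completed

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(STEPS) for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s < len(STEPS):
            # Epsilon-greedy action selection.
            a = rng.randrange(len(STEPS)) if rng.random() < eps else max(
                range(len(STEPS)), key=lambda i: q[s][i])
            # Reward +1 for the correct next step (state advances), -1 otherwise.
            r = 1.0 if a == s else -1.0
            s2 = s + 1 if a == s else s
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
policy = [STEPS[max(range(len(STEPS)), key=lambda i: q[s][i])] for s in range(len(STEPS))]
print(policy)
```

Imitation learning would additionally initialize or regularize the policy with surgeon demonstrations, which is how the surgeon-in-the-loop personalization described above enters the training loop.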
Can artificial intelligence help reduce disparities in medical care?
Any question about the utility of a tool is best answered by giving it a go. Try the tool, compare it with others, change the design to improve it. One might indeed be able to drive a nail with a rock, pistol or hoe, but it should not take long to figure out that a hammer, a blob of metal on a stick, is better suited to driving nails. Computers and software are tools.
Using AI to Make Knowledge Workers More Effective
New AI capabilities that can recognize context, concepts, and meaning are opening up surprising new pathways for collaboration between knowledge workers and machines. Experts can now provide more of their own input for training, quality control, and fine-tuning of AI outcomes. Machines can augment the expertise of their human collaborators and sometimes help create new experts. These systems, in more closely mimicking human intelligence, are proving to be more robust than the big data-driven systems that came before them. And they could profoundly affect the 48% of the US workforce that are knowledge workers--and the more than 230 million knowledge-worker roles globally.
- North America > United States (0.25)
- Europe > United Kingdom (0.05)
- Europe > France > Brittany (0.05)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (0.71)
- Information Technology > Artificial Intelligence > Cognitive Science (0.50)
- Information Technology > Sensing and Signal Processing > Image Processing (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)
DeepPhase: Surgical Phase Recognition in CATARACTS Videos
Zisimopoulos, Odysseas, Flouty, Evangello, Luengo, Imanol, Giataganas, Petros, Nehme, Jean, Chow, Andre, Stoyanov, Danail
Automated surgical workflow analysis and understanding can assist surgeons in standardizing procedures and can enhance post-surgical assessment, indexing, and interventional monitoring. Computer-assisted interventional (CAI) systems based on video can perform workflow estimation through surgical instrument recognition, linking the detected instruments to an ontology of procedural phases. In this work, we adopt a deep learning paradigm to detect surgical instruments in cataract surgery videos, which in turn feed a surgical phase inference recurrent network that encodes the temporal ordering of phase steps within the phase classification. Our models achieve results comparable to the state of the art for surgical tool detection and phase recognition, with accuracies of 99% and 78%, respectively.
- Health & Medicine > Surgery (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.69)
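DeepPhase encodes temporal context with a recurrent network; a much simpler illustration of why temporal context helps phase recognition is a sliding mode filter that suppresses single-frame flicker in per-frame predictions (the phase IDs here are arbitrary):

```python
from collections import Counter

def smooth_phases(preds, window=5):
    """Majority-vote each frame's phase over a centered window to suppress flicker."""
    half = window // 2
    out = []
    for i in range(len(preds)):
        neighborhood = preds[max(0, i - half): i + half + 1]
        out.append(Counter(neighborhood).most_common(1)[0][0])
    return out

# A single-frame misprediction (2) inside a run of phase 1 is smoothed away.
noisy = [1, 1, 1, 2, 1, 1, 1, 3, 3, 3]
print(smooth_phases(noisy))
# → [1, 1, 1, 1, 1, 1, 1, 3, 3, 3]
```

A recurrent network goes further by learning phase-transition statistics rather than applying a fixed vote, but the goal of exploiting neighboring frames is the same.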
D-Wave Launches Machine Learning Services Business
Quantum computing pioneer D-Wave Systems today announced a new business unit – Quadrant – to provide machine learning services based on lessons from its quantum computing research. Quadrant will specialize in the use of generative learning models which require smaller sets of labelled data to generate models than typical discriminative methods. As a proof point of the approach's power, D-Wave is calling attention to its winning effort in a recent Siemens medical imaging grand challenge – CATARACT – to automate identification of surgical instruments used in cataract surgery. "D-Wave is committed to tackling real-world problems, today. Quadrant is a natural extension of the scientific and technological advances from D-Wave as we continue to explore new applications for our quantum systems," said Vern Brownell, CEO at D-Wave in today's announcement.
Data-Efficient Machine Learning - insideBIGDATA
From Quadrant (a D-Wave business), the whitepaper "Data-Efficient Machine Learning" describes a practical impediment to the application of deep neural network models when large training data sets are unavailable. Encouragingly, however, it shows that recent machine learning advances make it possible to obtain the benefits of deep neural networks by making more efficient use of the training data most practitioners do have. Quadrant leverages generative machine learning, which requires much less labeled data than common discriminative models. This is incredibly useful in countless applications, including medical imaging, which is often limited to relatively small data sets. For a first case study, Siemens Healthineers partnered with Quadrant to identify surgical tools used in cataract surgery with 99.71% accuracy.
- Health & Medicine > Surgery (0.86)
- Health & Medicine > Health Care Technology (0.82)
- Health & Medicine > Diagnostic Medicine > Imaging (0.43)