AITopics | robotic surgery

robotic surgery

More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery

Dong, Wenzhen, Yu, Jieming, Huang, Yiming, Wang, Hongqiu, Zhu, Lei, Chung, Albert C. S., Ren, Hongliang, Bai, Long

arXiv.org Artificial Intelligence, Dec-11-2025

The recent SAM 3 and SAM 3D have introduced significant advancements over their predecessor, SAM 2, particularly with the integration of language-based segmentation and enhanced 3D perception capabilities. SAM 3 supports zero-shot segmentation across a wide range of prompts, including point, bounding box, and language-based prompts, allowing for more flexible and intuitive interactions with the model. In this empirical evaluation, we assess the performance of SAM 3 in robot-assisted surgery, benchmarking its zero-shot segmentation with point and bounding box prompts and exploring its effectiveness in dynamic video tracking, alongside its newly introduced language prompt segmentation. While language prompts show potential, their performance in the surgical domain is currently suboptimal, highlighting the need for further domain-specific training. Additionally, we investigate SAM 3D's depth reconstruction abilities, demonstrating its capacity to process surgical scene data and reconstruct 3D anatomical structures from 2D images. Through comprehensive testing on the MICCAI EndoVis 2017 and EndoVis 2018 benchmarks, SAM 3 shows clear improvements over SAM and SAM 2 in both image and video segmentation under spatial prompts, while the zero-shot evaluations of SAM 3D on SCARED, StereoMIS, and EndoNeRF indicate strong monocular depth estimation and realistic 3D instrument reconstruction, yet also reveal remaining limitations in complex, highly dynamic surgical scenes.

large language model, machine learning, segmentation, (15 more...)

2512.07596

Country:

Asia > China > Hong Kong (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.50)

LapSurgie: Humanoid Robots Performing Surgery via Teleoperated Handheld Laparoscopy

Liang, Zekai, Liang, Xiao, Atar, Soofiyan, Das, Sreyan, Chiu, Zoe, Zhang, Peihan, Richter, Florian, Liu, Shanglei, Yip, Michael C.

arXiv.org Artificial Intelligence, Oct-7-2025

Robotic laparoscopic surgery has gained increasing attention in recent years for its potential to deliver more efficient and precise minimally invasive procedures. However, adoption of surgical robotic platforms remains largely confined to high-resource medical centers, exacerbating healthcare disparities in rural and low-resource regions. To close this gap, a range of solutions has been explored, from remote mentorship to fully remote telesurgery. Yet, the practical deployment of surgical robotic systems to underserved communities remains an unsolved challenge. Humanoid systems offer a promising path toward deployability, as they can directly operate in environments designed for humans, including operating rooms, without extensive infrastructure modifications. In this work, we introduce LapSurgie, the first humanoid-robot-based laparoscopic teleoperation framework. The system leverages an inverse-mapping strategy for manual-wristed laparoscopic instruments that abides by remote center-of-motion constraints, enabling precise hand-to-tool control of off-the-shelf surgical laparoscopic tools without additional setup requirements. A control console equipped with a stereo vision system provides real-time visual feedback. Finally, a comprehensive user study across platforms demonstrates the effectiveness of the proposed framework and provides initial evidence for the feasibility of deploying humanoid robots in laparoscopic procedures.

artificial intelligence, platform, surgery, (15 more...)

2510.03529

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Surgery (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.81)

Surgical-MambaLLM: Mamba2-enhanced Multimodal Large Language Model for VQLA in Robotic Surgery

Hao, Pengfei, Wang, Hongqiu, Li, Shuaibo, Xing, Zhaohu, Yang, Guang, Wu, Kaishun, Zhu, Lei

arXiv.org Artificial Intelligence, Sep-23-2025

In recent years, Visual Question Localized-Answering in robotic surgery (Surgical-VQLA) has gained significant attention for its potential to assist medical students and junior doctors in understanding surgical scenes. Recently, the rapid development of Large Language Models (LLMs) has provided more promising solutions for this task. However, current methods struggle to establish complex dependencies between text and visual details, and have difficulty perceiving the spatial information of surgical scenes. To address these challenges, we propose a novel method, Surgical-MambaLLM, which is the first to combine Mamba2 with an LLM in the surgical domain. It leverages Mamba2's ability to effectively capture cross-modal dependencies and perceive spatial information in surgical scenes, thereby enhancing the LLM's understanding of surgical images. Specifically, we propose the Cross-modal Bidirectional Mamba2 Integration (CBMI) module, which uses Mamba2's cross-modal integration capabilities for effective multimodal fusion. Additionally, tailored to the geometric characteristics of surgical scenes, we design the Surgical Instrument Perception (SIP) scanning mode for Mamba2 to scan the surgical images, enhancing the model's spatial understanding of the surgical scene. Extensive experiments demonstrate that our Surgical-MambaLLM model outperforms the state-of-the-art methods on the EndoVis17-VQLA and EndoVis18-VQLA datasets, significantly improving the performance of the Surgical-VQLA task.

large language model, machine learning, natural language, (16 more...)

2509.16618

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > Promising Solution (1.00)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

RoboTwin: A Robotic Teleoperation Framework Using Digital Twins

Yelchuri, Harsha, Singh, Diwakar Kumar, Gnani, Nithish Krishnabharathi, Prabhakar, T V, Singh, Chandramani

arXiv.org Artificial Intelligence, Jun-3-2025

Robotic surgery imposes a significant cognitive burden on the surgeon. This cognitive burden increases in the case of remote robotic surgeries due to latency between entities and thus might affect the quality of surgery. Here, the patient side and the surgeon side are geographically separated by hundreds to thousands of kilometres. Real-time teleoperation of robots requires strict latency bounds for control and feedback. We propose a dual digital twin (DT) framework and explain the simulation environment and teleoperation framework. Here, the doctor visually controls the locally available DT of the patient side and thus experiences minimum latency. The second digital twin serves two purposes. Firstly, it provides a layer of safety for operator-related mishaps, and secondly, it conveys the coordinates of known and unknown objects back to the operator's side digital twin. We show that teleoperation accuracy and user experience are enhanced with our approach. Experimental results using the NASA-TLX metric show that the quality of surgery is vastly improved with DT, perhaps due to reduced cognitive burden. The network data rate for identifying objects at the operator side is 25x lower than normal.

artificial intelligence, digital twin, human computer interaction, (17 more...)

2506.01027

Country:

North America > United States (0.49)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)
Government > Regional Government > North America Government > United States Government (0.49)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

Confidence-based Intent Prediction for Teleoperation in Bimanual Robotic Suturing

Hu, Zhaoyang Jacopo, Xu, Haozheng, Kim, Sion, Li, Yanan, Baena, Ferdinando Rodriguez y, Burdet, Etienne

arXiv.org Artificial Intelligence, Apr-30-2025

Robotic-assisted procedures offer enhanced precision, but while fully autonomous systems are limited in task knowledge, in modeling unstructured environments, and in generalisation abilities, fully manual teleoperated systems also face challenges such as delay, stability, and reduced sensory information. To address these, we developed an interactive control strategy that assists the human operator by predicting their motion plan at both high and low levels. At the high level, a surgeme recognition system is employed through a Transformer-based real-time gesture classification model to dynamically adapt to the operator's actions, while at the low level, a Confidence-based Intention Assimilation Controller adjusts robot actions based on user intent and shared control paradigms. The system is built around a robotic suturing task, supported by sensors that capture the kinematics of the robot and task dynamics. Experiments across users with varying skill levels demonstrated the effectiveness of the proposed approach, showing statistically significant improvements in task completion time and user satisfaction compared to traditional teleoperation. In traditional teleoperation, the human operator fully controls the robot's movements [1]. Robots like the da Vinci Surgical System are equipped with sensors and models offering valuable local information inaccessible to the human operator, such as during visual occlusions or operations with different sensory modalities. By spanning the spectrum between traditional fully manual teleoperation and full autonomy, shared control leverages the benefits of both to enhance teleoperation with the robot's sensory data and control [2]. While demonstrated for suturing assistance [3], [4], these methods overlook the impact on positional uncertainty, environmental unknowns, or instrument errors. For example, robotic surgery cameras are frequently occluded by body tissues or parts of the robot [5].

artificial intelligence, robot, traditional teleoperation, (16 more...)

2504.20761

Country:

Europe > United Kingdom (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)

Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery

Ma, Boyi, Zhao, Yanguang, Wang, Jie, Wang, Guankun, Yuan, Kun, Chen, Tong, Bai, Long, Ren, Hongliang

arXiv.org Artificial Intelligence, Apr-3-2025

The DeepSeek models have shown exceptional performance in general scene understanding, question-answering (QA), and text generation tasks, owing to their efficient training paradigm and strong reasoning capabilities. In this study, we investigate the dialogue capabilities of the DeepSeek model in robotic surgery scenarios, focusing on tasks such as Single Phrase QA, Visual QA, and Detailed Description. The Single Phrase QA tasks further include sub-tasks such as surgical instrument recognition, action understanding, and spatial position analysis. We conduct extensive evaluations using publicly available datasets, including EndoVis18 and CholecT50, along with their corresponding dialogue data. Our empirical study shows that, compared to existing general-purpose multimodal large language models, DeepSeek-VL2 performs better on complex understanding tasks in surgical scenes. Additionally, although DeepSeek-V3 is purely a language model, we find that when image tokens are directly inputted, the model demonstrates better performance on single-sentence QA tasks. However, overall, the DeepSeek models still fall short of meeting the clinical requirements for understanding surgical scenes. Under general prompts, DeepSeek models lack the ability to effectively analyze global surgical concepts and fail to provide detailed insights into surgical scenarios. Based on our observations, we argue that the DeepSeek models are not ready for vision-language tasks in surgical contexts without fine-tuning on surgery-specific datasets.

arxiv preprint arxiv, large language model, machine learning, (21 more...)

2503.2313

Country:

Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
Asia > China > Hong Kong (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Enhanced Position Estimation in Tactile Internet-Enabled Remote Robotic Surgery Using MOESP-Based Kalman Filter

Lashari, Muhammad Hanif, Batayneh, Wafa, Khokhar, Ashfaq, Ahmed, Shakil

arXiv.org Artificial Intelligence, Jan-27-2025

Accurately estimating the position of the patient-side robotic arm in real time during remote surgery is a significant challenge, especially within Tactile Internet (TI) environments. This paper presents a new and efficient method for position estimation using a Kalman Filter (KF) combined with the Multivariable Output-Error State Space (MOESP) method for system identification. Unlike traditional approaches that require prior knowledge of the system's dynamics, this study uses the JIGSAWS dataset, a comprehensive collection of robotic surgical data, along with input from the Master Tool Manipulator (MTM) to derive the state-space model directly. The MOESP method allows accurate modeling of the Patient Side Manipulator (PSM) dynamics without prior system models, improving the KF's performance under simulated network conditions, including delays, jitter, and packet loss. These conditions mimic real-world challenges in Tactile Internet applications. The findings demonstrate the KF's improved resilience and accuracy in state estimation, achieving over 95 percent accuracy despite network-induced uncertainties.
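To make the estimation idea in this abstract concrete, here is a minimal sketch of a scalar Kalman filter that keeps predicting when a measurement is lost in the network. It is not the paper's implementation: the function name `kalman_1d`, the scalar model, and the values of `a`, `c`, `q`, and `r` are all illustrative assumptions. In the approach described above, the state-space model would come from MOESP identification, which is not shown here.

```python
from typing import List, Optional, Sequence, Tuple


def kalman_1d(
    measurements: Sequence[Optional[float]],
    a: float = 1.0,    # state transition (assumed; would come from system identification)
    c: float = 1.0,    # observation gain (assumed)
    q: float = 1e-3,   # process noise variance (assumed)
    r: float = 1e-1,   # measurement noise variance (assumed)
    x0: float = 0.0,   # initial position estimate
    p0: float = 1.0,   # initial estimate variance
) -> List[Tuple[float, float]]:
    """Return (estimate, variance) per step; None marks a lost packet."""
    x, p = x0, p0
    history = []
    for z in measurements:
        # Predict: propagate the estimate and its uncertainty through the model.
        x = a * x
        p = a * p * a + q
        if z is not None:
            # Update: blend prediction and measurement using the Kalman gain.
            k = p * c / (c * p * c + r)
            x = x + k * (z - c * x)
            p = (1.0 - k * c) * p
        # When z is None the update is skipped, so the variance only grows.
        history.append((x, p))
    return history


if __name__ == "__main__":
    # The second sample is dropped to mimic packet loss.
    for estimate, variance in kalman_1d([1.0, None, 1.0]):
        print(f"estimate={estimate:.3f} variance={variance:.4f}")
```

Skipping the update on a lost packet is one common way to handle missing measurements; the variance rises during the gap and falls again once data returns.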

application, artificial intelligence, machine learning, (17 more...)

2501.16485

Country:

North America > United States > Iowa > Story County > Ames (0.14)
North America > United States > California (0.14)
North America > United States > Utah (0.04)
(6 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine > Surgery (0.85)
Health & Medicine > Health Care Technology (0.85)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Predictive Approach for Enhancing Accuracy in Remote Robotic Surgery Using Informer Model

Lashari, Muhammad Hanif, Ahmed, Shakil, Batayneh, Wafa, Khokhar, Ashfaq

arXiv.org Artificial Intelligence, Jan-24-2025

Precise and real-time estimation of the robotic arm's position on the patient's side is essential for the success of remote robotic surgery in Tactile Internet (TI) environments. This paper presents a prediction model based on the Transformer-based Informer framework for accurate and efficient position estimation. Additionally, it combines a Four-State Hidden Markov Model (4-State HMM) to simulate realistic packet loss scenarios. The proposed approach addresses challenges such as network delays, jitter, and packet loss to ensure reliable and precise operation in remote surgical applications. The method integrates the optimization problem into the Informer model by embedding constraints such as energy efficiency, smoothness, and robustness into its training process using a differentiable optimization layer. The Informer framework uses features such as ProbSparse attention, attention distilling, and a generative-style decoder to focus on position-critical features while maintaining a low computational complexity of O(L log L). The method is evaluated using the JIGSAWS dataset, achieving a prediction accuracy of over 90 percent under various network scenarios. A comparison with models such as TCN, RNN, and LSTM demonstrates the Informer framework's superior performance in handling position prediction and meeting real-time requirements, making it suitable for Tactile Internet-enabled robotic surgery.
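The packet-loss simulation mentioned in this abstract can be illustrated with a small four-state Markov chain in which each hidden state has its own loss probability. This is a sketch under stated assumptions, not the paper's model: the function `simulate_packet_loss`, the `TRANSITION` matrix, and the `LOSS_PROB` values are hypothetical and chosen only to show bursty loss behaviour.

```python
import random
from typing import List, Sequence

# Hypothetical four-state channel: good, mildly degraded, degraded, bad.
# Each row holds the transition probabilities out of one state (rows sum to 1).
TRANSITION = [
    [0.90, 0.07, 0.02, 0.01],
    [0.30, 0.50, 0.15, 0.05],
    [0.10, 0.30, 0.40, 0.20],
    [0.05, 0.15, 0.30, 0.50],
]
# Hypothetical probability of losing a packet while in each state.
LOSS_PROB = [0.00, 0.05, 0.30, 0.80]


def simulate_packet_loss(
    n_packets: int,
    transition: Sequence[Sequence[float]],
    loss_prob: Sequence[float],
    seed: int = 0,
    start_state: int = 0,
) -> List[bool]:
    """Return one flag per packet; True means the packet was lost."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    state = start_state
    lost = []
    for _ in range(n_packets):
        # Emission: the current hidden state decides how likely a loss is.
        lost.append(rng.random() < loss_prob[state])
        # Transition: draw the next hidden state from the current row.
        state = rng.choices(range(len(transition)), weights=transition[state])[0]
    return lost


if __name__ == "__main__":
    trace = simulate_packet_loss(1000, TRANSITION, LOSS_PROB, seed=1)
    print(f"lost {sum(trace)} of {len(trace)} packets")
```

Because the chain tends to stay in its current state, losses arrive in bursts rather than independently, which is the usual motivation for a multi-state channel model over a fixed loss rate.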

artificial intelligence, deep learning, machine learning, (18 more...)

2501.14678

Country:

North America > United States > Iowa > Story County > Ames (0.14)
North America > United States > California (0.14)
North America > United States > Utah (0.04)
(6 more...)

Genre:

Personal (0.68)
Research Report (0.64)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (0.92)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

An innovative mixed reality approach for Robotics Surgery

Rus, Gabriela, Hajjar, Nadim Al, Zima, Ionut, Vaida, Calin, Radu, Corina, Chablat, Damien, Ciocan, Andra, Pîslă, Doina

arXiv.org Artificial Intelligence, Jan-7-2025

Robotic-assisted procedures offer numerous advantages over traditional approaches, including improved dexterity, reduced fatigue, minimized trauma, and superior outcomes. However, the main challenge of these systems remains the poor visualization and perception of the surgical field. The goal of this paper is to present an innovative application that improves surgical procedures by offering assistance in both the preplanning and intraoperative steps of the surgery. The system has been designed to offer a better understanding of the patient through techniques that provide medical image visualization, perception of 3D anatomical structures, and robotic planning. The application was designed to be intuitive and user friendly, providing an augmented reality experience through the HoloLens 2 device. It was tested in laboratory conditions, yielding positive results.

application, artificial intelligence, procedure, (14 more...)

doi: 10.1088/1757-899X/1320/1/012010

2501.03819

Country:

Europe > Romania > Nord-Vest Development Region > Cluj County > Cluj-Napoca (0.05)
Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.05)
North America > Canada > Quebec > Montreal (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

The ethical landscape of robot-assisted surgery. A systematic review

Haltaufderheide, Joschka, Pfisterer-Heise, Stefanie, Pieper, Dawid, Ranisch, Robert

arXiv.org Artificial Intelligence, Nov-18-2024

Background: Robot-assisted surgery has been widely adopted in recent years. However, compared to other health technologies operating in close proximity to patients in a vulnerable state, ethical issues of robot-assisted surgery have received less attention. Against the background of increasing automation that is expected to raise new ethical issues, this systematic review aims to map the state of the ethical debate in this field. Methods: A protocol was registered in the international prospective register of systematic reviews (PROSPERO CRD42023397951). Medline via PubMed, EMBASE, CINAHL, Philosopher's Index, IEEE Xplore, Web of Science (Core Collection), Scopus and Google Scholar were searched in January 2023. Screening, extraction, and analysis were conducted independently by two authors. A qualitative narrative synthesis was performed. Results: Out of 1,723 records, 66 records were included in the final dataset. Seven major strands of the ethical debate emerged during analysis. These include questions of harms and benefits, responsibility and control, professional-patient relationship, ethical issues in surgical training and learning, justice, translational questions, and economic considerations. Discussion: The identified themes testify to a broad range of different and differing ethical issues requiring careful deliberation and integration into the surgical ethos. Looking forward, we argue that a different perspective in addressing robotic surgical devices might be helpful to consider upcoming challenges of automation.

artificial intelligence, surgeon, surgery, (16 more...)

2411.11637

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Surgery (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.46)
