3D Acetabular Surface Reconstruction from 2D Pre-operative X-ray Images using SRVF Elastic Registration and Deformation Graph
Zhang, Shuai, Wang, Jinliang, Konan, Sujith, Wang, Xu, Stoyanov, Danail, Mazomenos, Evangelos B.
Accurate and reliable selection of the appropriate acetabular cup size is crucial for restoring joint biomechanics in total hip arthroplasty (THA). This paper proposes a novel framework that integrates a square-root velocity function (SRVF)-based elastic shape registration technique with an embedded deformation (ED) graph approach to reconstruct the 3D articular surface of the acetabulum by fusing multiple views of 2D pre-operative pelvic X-ray images with a hemispherical surface model. The SRVF-based elastic registration establishes 2D-3D correspondences between the parametric hemispherical model and the X-ray images, and the ED framework incorporates the SRVF-derived correspondences as constraints to optimize the 3D acetabular surface reconstruction using nonlinear least-squares optimization. Validations on both simulated and real patient datasets demonstrate the robustness and potential clinical value of the proposed algorithm. The reconstruction results can assist surgeons in selecting the correct acetabular cup on the first attempt in primary THA, minimising the need for revision surgery. Code and data will be released upon acceptance.
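As a rough illustration of the transform underpinning the registration step, here is a minimal NumPy sketch of the SRVF of a sampled 2D contour. The contours, sampling density, and plain L2 comparison are illustrative stand-ins; the full method additionally optimises over curve reparametrisations and lifts the resulting correspondences into the 3D ED optimisation.

```python
import numpy as np

def srvf(curve):
    """Square-root velocity function of a discretely sampled curve.

    curve: (N, d) array of points along the contour.
    Returns the (N-1, d) discretisation of q(t) = f'(t) / sqrt(||f'(t)||).
    """
    vel = np.diff(curve, axis=0)              # finite-difference velocity f'(t)
    speed = np.linalg.norm(vel, axis=1)       # ||f'(t)|| at each sample
    speed = np.maximum(speed, 1e-12)          # guard against zero-length steps
    return vel / np.sqrt(speed)[:, None]

# Toy example: two acetabular-rim-like contours (a circle and an ellipse).
t = np.linspace(0.0, 2.0 * np.pi, 200)
rim_a = np.stack([np.cos(t), np.sin(t)], axis=1)
rim_b = np.stack([1.2 * np.cos(t), 0.9 * np.sin(t)], axis=1)

q_a, q_b = srvf(rim_a), srvf(rim_b)
# Under the SRVF map, the elastic distance between curves reduces to an L2
# distance between q-functions (up to reparametrisation, which the full
# method optimises over separately, e.g. with dynamic programming).
print("SRVF L2 distance:", np.linalg.norm(q_a - q_b))
```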
PitVQA++: Vector Matrix-Low-Rank Adaptation for Open-Ended Visual Question Answering in Pituitary Surgery
He, Runlong, Khan, Danyal Z., Mazomenos, Evangelos B., Marcus, Hani J., Stoyanov, Danail, Clarkson, Matthew J., Islam, Mobarakol
Vision-Language Models (VLMs) in visual question answering (VQA) offer a unique opportunity to enhance intra-operative decision-making, promote intuitive interactions, and significantly advance surgical education. However, the development of VLMs for surgical VQA is challenging due to limited datasets and the risk of overfitting and catastrophic forgetting during full fine-tuning of pretrained weights. While parameter-efficient techniques like Low-Rank Adaptation (LoRA) and Matrix of Rank Adaptation (MoRA) address adaptation challenges, their uniform parameter distribution overlooks the feature hierarchy in deep networks, where earlier layers, which learn general features, require more parameters than later ones. This work introduces PitVQA++ with an open-ended PitVQA dataset and vector matrix-low-rank adaptation (Vector-MoLoRA), an innovative VLM fine-tuning approach for adapting GPT-2 to pituitary surgery. Open-Ended PitVQA comprises 101,803 frames from 25 procedural videos with 745,972 question-answer sentence pairs, covering key surgical elements such as phase and step recognition, context understanding, tool detection, localization, and interaction recognition. Vector-MoLoRA incorporates the principles of LoRA and MoRA to develop a matrix-low-rank adaptation strategy that employs vector ranking to allocate more parameters to earlier layers, gradually reducing them in the later layers. Furthermore, our risk-coverage analysis highlights its enhanced reliability and trustworthiness in handling uncertain predictions.
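The layer-wise rank allocation can be illustrated with a toy PyTorch sketch: a plain LoRA-style adapter whose rank tapers from early to late layers. The linear schedule, rank bounds, and `LoRALinear` wrapper are hypothetical, and the actual Vector-MoLoRA additionally blends MoRA-style matrix updates, which this sketch omits.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer with a trainable low-rank update W + B @ A."""
    def __init__(self, base: nn.Linear, rank: int):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # freeze pretrained weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

# Hypothetical vector-rank schedule: earlier layers (general features) get
# larger ranks that taper linearly toward the later, task-specific layers.
n_layers, r_max, r_min = 12, 16, 4
ranks = [round(r_max - (r_max - r_min) * i / (n_layers - 1))
         for i in range(n_layers)]
print(ranks)                                   # e.g. [16, 15, 14, ..., 4]

layers = nn.ModuleList(LoRALinear(nn.Linear(768, 768), r) for r in ranks)
```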
RRT-GPMP2: A Motion Planner for Mobile Robots in Complex Maze Environments
Meng, Jiawei, Stoyanov, Danail
With the development of science and technology, mobile robots are playing an increasingly important role in the new round of technological revolution, and they may assist or replace human beings in a great number of areas. To increase the degree of automation of mobile robots, advanced motion planners need to be integrated into them to cope with various environments. Complex maze environments are common in the potential application scenarios of different mobile robots. This article proposes a novel motion planner named the rapidly-exploring random tree based Gaussian process motion planner 2 (RRT-GPMP2), which aims to tackle the motion planning problem for mobile robots in complex maze environments. More specifically, the proposed motion planner combines the advantages of a trajectory-optimisation motion planning algorithm, the Gaussian process motion planner 2 (GPMP2), with those of a sampling-based motion planning algorithm, the rapidly-exploring random tree (RRT). To validate the performance and practicability of the proposed motion planner, we tested it in several MATLAB simulations and applied it to a marine mobile robot in a virtual scenario in the Robot Operating System (ROS).
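The hybrid idea, sampling first and optimising second, can be sketched as follows. The "RRT" waypoints here are hard-coded stand-ins for a real sampling planner's output, and the Laplacian smoothing term is a crude substitute for GPMP2's Gaussian-process prior; the sketch only shows how a sampled path seeds the optimiser.

```python
import numpy as np

# Coarse collision-free waypoints, as a sampling-based planner might return.
waypoints = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 2.5], [4.0, 4.0]])

# Upsample the waypoints into a dense initial trajectory for the optimiser.
traj = np.concatenate([
    np.linspace(waypoints[i], waypoints[i + 1], 20, endpoint=False)
    for i in range(len(waypoints) - 1)
] + [waypoints[-1:]])

obstacle, radius = np.array([2.0, 1.0]), 0.8   # one circular maze obstacle

def cost_grad(x):
    g = np.zeros_like(x)
    # Smoothness: Laplacian step pulling each interior point toward its
    # neighbours' midpoint (a stand-in for GPMP2's GP smoothness prior).
    g[1:-1] += 2.0 * (2.0 * x[1:-1] - x[:-2] - x[2:])
    # Obstacle: quadratic hinge penalty inside the inflated obstacle radius.
    d = x - obstacle
    dist = np.linalg.norm(d, axis=1, keepdims=True)
    pen = np.maximum(radius - dist, 0.0)
    g += -2.0 * pen * d / np.maximum(dist, 1e-9)
    return g

for _ in range(300):                            # gradient descent on the cost
    g = cost_grad(traj)
    g[0] = g[-1] = 0.0                          # keep start and goal fixed
    traj -= 0.05 * g
```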
Dynamic Obstacle Avoidance of Unmanned Surface Vehicles in Maritime Environments Using Gaussian Processes Based Motion Planning
Meng, Jiawei, Liu, Yuanchang, Stoyanov, Danail
In recent years, unmanned surface vehicles have been extensively utilised in a variety of maritime applications such as the exploration of unknown areas, autonomous transportation, and offshore patrol. In such applications, unmanned surface vehicles executing missions may collide with static obstacles such as islands and reefs, and with dynamic obstacles such as other moving unmanned surface vehicles. To accomplish these missions successfully, motion planning algorithms that can efficiently generate smooth, collision-free trajectories around both static and dynamic obstacles are essential. In this article, we propose a novel motion planning algorithm named the Dynamic Gaussian process motion planner 2, which extends the application scope of the Gaussian process motion planner 2 to complex, dynamic environments containing both static and dynamic obstacles. First, we introduce an approach to generate safe areas for dynamic obstacles using modified multivariate Gaussian distributions. Second, we introduce an approach to integrate real-time status information of dynamic obstacles into these distributions, so that multivariate Gaussian distributions carrying the real-time statuses of dynamic obstacles can be added to the factor-graph optimisation process to generate an optimised trajectory. The proposed Dynamic Gaussian process motion planner 2 has been validated in a series of benchmark MATLAB simulations and in a dynamic obstacle avoidance mission in a high-fidelity maritime environment in the Robot Operating System (ROS) to demonstrate its functionality and practicability.
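A sketch of how a dynamic obstacle's position and velocity might shape a Gaussian safe-area term: the covariance is stretched along the velocity direction so the penalty extends ahead of the obstacle's motion. The sigma and stretch values are illustrative, and the paper's exact modified distribution and its factor-graph integration are not reproduced here.

```python
import numpy as np

def gaussian_obstacle_cost(points, obs_pos, obs_vel, sigma=0.5, stretch=2.0):
    """Penalty from a velocity-aligned multivariate Gaussian around a moving
    obstacle. points: (N, 2) trajectory samples; obs_pos/obs_vel: (2,)."""
    speed = np.linalg.norm(obs_vel)
    u = obs_vel / speed if speed > 1e-9 else np.array([1.0, 0.0])
    R = np.stack([u, [-u[1], u[0]]], axis=1)   # rotate into the velocity frame
    cov = R @ np.diag([(stretch * sigma) ** 2, sigma ** 2]) @ R.T
    d = points - obs_pos
    m = np.einsum("ni,ij,nj->n", d, np.linalg.inv(cov), d)  # Mahalanobis^2
    return np.exp(-0.5 * m)                    # high near the obstacle, ~0 far

traj = np.stack([np.linspace(0, 5, 50), np.linspace(0, 5, 50)], axis=1)
cost = gaussian_obstacle_cost(traj, obs_pos=np.array([2.5, 2.5]),
                              obs_vel=np.array([1.0, 0.0]))
print(cost.max())
```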
NCDD: Nearest Centroid Distance Deficit for Out-Of-Distribution Detection in Gastrointestinal Vision
Pokhrel, Sandesh, Bhandari, Sanjay, Ali, Sharib, Lambrou, Tryphon, Nguyen, Anh, Shrestha, Yash Raj, Watson, Angus, Stoyanov, Danail, Gyawali, Prashnna, Bhattarai, Binod
The integration of deep learning tools in gastrointestinal vision holds the potential for significant advancements in diagnosis, treatment, and overall patient care. A major challenge, however, is these tools' tendency to make overconfident predictions, even when encountering unseen or newly emerging disease patterns, undermining their reliability. We address this critical issue of reliability by framing it as an out-of-distribution (OOD) detection problem, where previously unseen and emerging diseases are identified as OOD examples. However, gastrointestinal images pose a unique challenge due to the overlapping feature representations between in-distribution (ID) and OOD examples. Existing approaches often overlook this characteristic, as they are primarily developed for natural image datasets, where feature distinctions are more apparent. Despite the overlap, we hypothesize that the features of an ID example will cluster closer to the centroid of its ground-truth class, resulting in a shorter distance to the nearest centroid, whereas OOD examples remain roughly equidistant from all class centroids. Based on this observation, we propose a novel nearest centroid distance deficit (NCDD) score in the feature space for gastrointestinal OOD detection. Evaluations across multiple deep learning architectures and two publicly available benchmarks, Kvasir2 and Gastrovision, demonstrate the effectiveness of our approach compared to several state-of-the-art methods. The code and implementation details are publicly available at: https://github.com/bhattarailab/NCDD
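One plausible instantiation of the idea, assuming the score contrasts the nearest-centroid distance with the average distance to all centroids; the exact definition used in the paper should be taken from the released code at the URL above.

```python
import numpy as np

def fit_centroids(feats, labels, n_classes):
    """Class centroids of ID training features. feats: (N, D)."""
    return np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])

def ncdd_score(feats, centroids):
    """Illustrative nearest centroid distance deficit.

    ID features sit close to their own class centroid, so the gap between
    the mean centroid distance and the nearest one is large; OOD features
    are roughly equidistant from all centroids, so the gap shrinks.
    Higher score -> more ID-like. The paper's exact form may differ.
    """
    d = np.linalg.norm(feats[:, None, :] - centroids[None, :, :], axis=-1)
    return d.mean(axis=1) - d.min(axis=1)

rng = np.random.default_rng(0)
# Two synthetic ID classes, separated along the first two feature axes.
train = rng.normal(size=(200, 16)) + np.repeat(np.eye(16)[:2] * 5, 100, axis=0)
labels = np.repeat([0, 1], 100)
cents = fit_centroids(train, labels, 2)
print(ncdd_score(rng.normal(size=(5, 16)), cents))   # OOD-like queries score low
```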
PitRSDNet: Predicting Intra-operative Remaining Surgery Duration in Endoscopic Pituitary Surgery
Wijekoon, Anjana, Das, Adrito, Herrera, Roxana R., Khan, Danyal Z., Hanrahan, John, Carter, Eleanor, Luoma, Valpuri, Stoyanov, Danail, Marcus, Hani J., Bano, Sophia
Accurate intra-operative Remaining Surgery Duration (RSD) predictions allow anaesthetists to decide more accurately when to administer anaesthetic agents and drugs, and to notify hospital staff to send in the next patient. RSD prediction therefore plays an important role in improving patient care and minimising surgical theatre costs through efficient scheduling. In endoscopic pituitary surgery, RSD prediction is uniquely challenging due to variable workflow sequences, with a selection of optional steps contributing to high variability in surgery duration. This paper presents PitRSDNet, a spatio-temporal neural network model for predicting RSD during pituitary surgery that learns from historical data, focusing on workflow sequences. PitRSDNet integrates workflow knowledge into RSD prediction in two forms: 1) multi-task learning for concurrently predicting step and RSD; and 2) incorporating prior steps as context in temporal learning and inference. PitRSDNet is trained and evaluated on a new endoscopic pituitary surgery dataset of 88 videos and shows competitive performance improvements over previous statistical and machine learning methods. The findings also highlight how PitRSDNet improves RSD precision on outlier cases by utilising the knowledge of prior steps.
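The first form of workflow integration, a shared representation feeding joint step and RSD heads, might look like the following PyTorch sketch. The feature dimension, number of steps, and equal loss weighting are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    """Joint step-classification and RSD-regression head (illustrative).

    Assumes a backbone has already produced a per-frame feature vector."""
    def __init__(self, feat_dim=512, n_steps=15):
        super().__init__()
        self.step_head = nn.Linear(feat_dim, n_steps)   # which workflow step
        self.rsd_head = nn.Linear(feat_dim, 1)          # remaining minutes

    def forward(self, feats):
        return self.step_head(feats), self.rsd_head(feats).squeeze(-1)

feats = torch.randn(8, 512)                             # dummy frame features
step_logits, rsd = MultiTaskHead()(feats)
# Joint multi-task loss: step classification plus RSD regression.
loss = nn.functional.cross_entropy(step_logits, torch.randint(0, 15, (8,))) \
     + nn.functional.l1_loss(rsd, torch.rand(8) * 60)
loss.backward()
```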
RGB to Hyperspectral: Spectral Reconstruction for Enhanced Surgical Imaging
Czempiel, Tobias, Roddan, Alfie, Leiloglou, Maria, Hu, Zepeng, O'Neill, Kevin, Anichini, Giulio, Stoyanov, Danail, Elson, Daniel
This study investigates the reconstruction of hyperspectral signatures from RGB data to enhance surgical imaging, utilizing the publicly available HeiPorSPECTRAL dataset from porcine surgery and an in-house neurosurgery dataset. Various architectures based on convolutional neural networks (CNNs) and transformer models are evaluated using comprehensive metrics. Transformer models exhibit superior performance in terms of RMSE, SAM, PSNR, and SSIM by effectively integrating spatial information to predict accurate spectral profiles, encompassing both the visible and extended spectral ranges. Qualitative assessments demonstrate the capability to predict spectral profiles critical for informed surgical decision-making during procedures. Challenges associated with capturing both the visible and extended hyperspectral ranges are highlighted using the MAE, emphasizing the complexities involved. The findings open up a new research direction: hyperspectral reconstruction for surgical applications and clinical use cases in real-time surgical environments.
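For reference, the SAM metric mentioned above measures the angle between the predicted and ground-truth spectrum at each pixel, making it invariant to per-pixel intensity scaling. A minimal NumPy version, using a hypothetical 31-band cube:

```python
import numpy as np

def spectral_angle(pred, target, eps=1e-8):
    """Spectral Angle Mapper (SAM): per-pixel angle, in radians, between
    predicted and reference spectra. pred/target: (..., n_bands)."""
    dot = (pred * target).sum(-1)
    denom = np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1)
    return np.arccos(np.clip(dot / (denom + eps), -1.0, 1.0))

# Sanity check: scaled spectra give angle ~0, since SAM ignores per-pixel
# intensity scaling and only compares spectral shape.
spec = np.random.rand(4, 4, 31)                 # hypothetical 31-band cube
print(spectral_angle(spec, 2.0 * spec).max())   # ~0
```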
Robotic Arm Platform for Multi-View Image Acquisition and 3D Reconstruction in Minimally Invasive Surgery
Saikia, Alexander, Di Vece, Chiara, Bonilla, Sierra, He, Chloe, Magbagbeola, Morenike, Mennillo, Laurent, Czempiel, Tobias, Bano, Sophia, Stoyanov, Danail
Minimally invasive surgery (MIS) offers significant benefits such as reduced recovery time and minimised patient trauma, but poses challenges in visibility and access, making accurate 3D reconstruction a valuable tool for surgical planning and navigation. This work introduces a robotic arm platform for efficient multi-view image acquisition and precise 3D reconstruction in MIS settings. We mounted a laparoscope on a robotic arm and captured ex-vivo images of several ovine organs across varying lighting conditions (operating room and laparoscopic) and trajectories (spherical and laparoscopic). We employed recently released learning-based feature matchers combined with COLMAP to produce our reconstructions, which were evaluated against high-precision laser scans for quantitative assessment. Our results show that, whilst reconstructions suffer most under realistic MIS lighting and trajectories, many versions of our pipeline achieve close to sub-millimetre accuracy, with an average of 1.05 mm Root Mean Squared Error and 0.82 mm Chamfer distance. Our best reconstructions occur with operating room lighting and spherical trajectories. Our robotic platform provides a tool for controlled, repeatable multi-view data acquisition for 3D generation in MIS environments, which we hope will lead to new datasets for training learning-based models.
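The Chamfer distance used in the evaluation can be computed as below, using one common symmetric convention and SciPy k-d trees. The point clouds are stand-ins, and any alignment of the reconstruction to the laser scan before comparison is omitted.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point clouds a: (N, 3), b: (M, 3):
    the average of mean nearest-neighbour distances in both directions."""
    da, _ = cKDTree(b).query(a)    # each point of a -> nearest point in b
    db, _ = cKDTree(a).query(b)    # each point of b -> nearest point in a
    return 0.5 * (da.mean() + db.mean())

recon = np.random.rand(1000, 3)                            # stand-in reconstruction
scan = recon + np.random.normal(0, 0.001, recon.shape)     # stand-in laser scan
print(f"Chamfer: {chamfer_distance(recon, scan) * 1000:.2f}"
      " (in mm if inputs are in metres)")
```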
Think Step by Step: Chain-of-Gesture Prompting for Error Detection in Robotic Surgical Videos
Shao, Zhimin, Xu, Jialang, Stoyanov, Danail, Mazomenos, Evangelos B., Jin, Yueming
Despite significant advancements in robotic systems and surgical data science, ensuring safe and optimal execution in robot-assisted minimally invasive surgery (RMIS) remains a complex challenge. Current surgical error detection methods involve two parts: identifying surgical gestures and then detecting errors within each gesture clip. These methods seldom consider the rich contextual and semantic information inherent in surgical videos, and their performance is limited by the reliance on accurate gesture identification. Motivated by chain-of-thought prompting in natural language processing, this letter presents Chain-of-Gesture (COG) prompting, a novel real-time end-to-end error detection framework that leverages contextual information from surgical videos. The framework comprises two reasoning modules designed to mimic the decision-making processes of expert surgeons. Concretely, we first design a Gestural-Visual Reasoning module, which utilizes transformer and attention architectures for gesture prompting, while the second, a Multi-Scale Temporal Reasoning module, employs a multi-stage temporal convolutional network with both slow and fast paths to extract temporal information. We extensively validate our method on JIGSAWS, a public benchmark RMIS dataset. Our method encapsulates the reasoning processes inherent to surgical activities, enabling it to outperform the state of the art by 4.6% in F1 score, 4.6% in Accuracy, and 5.9% in Jaccard index while processing each frame in 6.69 milliseconds on average, demonstrating the great potential of our approach in enhancing the safety and efficacy of RMIS procedures and surgical education. The code will be made available.
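The slow/fast temporal idea can be sketched as two parallel temporal convolutions, one at full frame rate and one over subsampled frames. The stride, feature width, and additive fusion are illustrative assumptions; the paper's multi-stage temporal convolutional network is not reproduced here.

```python
import torch
import torch.nn as nn

class SlowFastTemporal(nn.Module):
    """Illustrative slow/fast temporal paths over per-frame features.

    The fast path keeps full temporal resolution; the slow path subsamples
    frames before convolving, then upsamples back to full length."""
    def __init__(self, dim=256, slow_stride=4):
        super().__init__()
        self.fast = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.slow = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.stride = slow_stride

    def forward(self, x):                       # x: (batch, dim, time)
        fast = self.fast(x)
        slow = self.slow(x[:, :, :: self.stride])
        slow = nn.functional.interpolate(slow, size=x.shape[-1], mode="linear")
        return fast + slow                      # fuse the two temporal scales

out = SlowFastTemporal()(torch.randn(2, 256, 120))
print(out.shape)                                # torch.Size([2, 256, 120])
```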
SEDMamba: Enhancing Selective State Space Modelling with Bottleneck Mechanism and Fine-to-Coarse Temporal Fusion for Efficient Error Detection in Robot-Assisted Surgery
Xu, Jialang, Sirajudeen, Nazir, Boal, Matthew, Francis, Nader, Stoyanov, Danail, Mazomenos, Evangelos
Automated detection of surgical errors can improve robot-assisted surgery. Despite promising progress, existing methods still struggle to capture rich temporal context and establish long-term dependencies while maintaining computational efficiency. In this paper, we propose a novel hierarchical model named SEDMamba, which incorporates the selective state space model (SSM) into surgical error detection, facilitating efficient long-sequence modelling with linear complexity. SEDMamba enhances the selective SSM with a bottleneck mechanism and fine-to-coarse temporal fusion (FCTF) to detect and temporally localize surgical errors in long videos. The bottleneck mechanism compresses and restores features within their spatial dimension, thereby reducing computational complexity. FCTF utilizes multiple dilated 1D convolutional layers to merge temporal information across diverse scale ranges, accommodating errors of varying durations. In addition, we deploy the established Observational Clinical Human Reliability Assessment (OCHRA) tool to annotate errors in the suturing tasks of an open-source radical prostatectomy dataset (SAR-RARP50), constructing the first frame-level in-vivo surgical error detection dataset to support error detection in real-world scenarios. Experimental results demonstrate that SEDMamba outperforms state-of-the-art methods with performance gains of at least 1.82% AUC and 3.80% AP, at significantly reduced computational complexity.
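FCTF's core ingredient, parallel dilated 1D convolutions merged across temporal scales, might be sketched as below. The channel width, dilation set, and mean fusion are illustrative rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class FineToCoarseFusion(nn.Module):
    """Fine-to-coarse temporal fusion sketch: parallel dilated 1D convolutions
    with growing dilation merge temporal context at several scales, so that
    both short and long error events leave a signature in the output."""
    def __init__(self, dim=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x):                       # x: (batch, dim, time)
        # padding == dilation with kernel 3 keeps the sequence length intact.
        return torch.stack([b(x) for b in self.branches]).mean(0)

feats = torch.randn(1, 64, 3000)                # long surgical video features
print(FineToCoarseFusion()(feats).shape)        # torch.Size([1, 64, 3000])
```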