AITopics | capsule endoscopy

Collaborating Authors

capsule endoscopy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CapsDT: Diffusion-Transformer for Capsule Robot Manipulation

He, Xiting, Su, Mingwu, Jiang, Xinqi, Bai, Long, Lai, Jiewen, Ren, Hongliang

arXiv.org Artificial IntelligenceJun-23-2025

-- Vision-Language-Action (VLA) models have emerged as a prominent research area, showcasing significant potential across a variety of applications. However, their performance in endoscopy robotics, particularly endoscopy capsule robots that perform actions within the digestive system, remains unexplored. The integration of VLA models into endoscopy robots allows more intuitive and efficient interactions between human operators and medical devices, improving both diagnostic accuracy and treatment outcomes. By processing interleaved visual inputs, and textual instructions, CapsDT can infer corresponding robotic control signals to facilitate endoscopy tasks. In addition, we developed a capsule endoscopy robot system, a capsule robot controlled by a robotic arm-held magnet, addressing different levels of four endoscopy tasks and creating corresponding capsule robot datasets within the stomach simulator . Comprehensive evaluations on various robotic tasks indicate that CapsDT can serve as a robust vision-language generalist, achieving state-of-the-art performance in various levels of endoscopy tasks while achieving a 26.25% success rate in real-world simulation manipulation. I. INTRODUCTION Endoscopy, for both diagnostic and therapeutic interventions, provides direct visualization and treatment capabilities within the gastrointestinal (GI) tract [1], [2], [3].

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2506.16263

Country:

Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > Macao (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Add feedback

eNCApsulate: NCA for Precision Diagnosis on Capsule Endoscopes

Krumb, Henry John, Mukhopadhyay, Anirban

arXiv.org Artificial IntelligenceMay-1-2025

Wireless Capsule Endoscopy is a non-invasive imaging method for the entire gastrointestinal tract, and is a pain-free alternative to traditional endoscopy. It generates extensive video data that requires significant review time, and localizing the capsule after ingestion is a challenge. Techniques like bleeding detection and depth estimation can help with localization of pathologies, but deep learning models are typically too large to run directly on the capsule. Neural Cellular Automata (NCA) for bleeding segmentation and depth estimation are trained on capsule endoscopic images. For monocular depth estimation, we distill a large foundation model into the lean NCA architecture, by treating the outputs of the foundation model as pseudo ground truth. We then port the trained NCA to the ESP32 microcontroller, enabling efficient image processing on hardware as small as a camera capsule. NCA are more accurate (Dice) than other portable segmentation models, while requiring more than 100x fewer parameters stored in memory than other small-scale models. The visual results of NCA depth estimation look convincing, and in some cases beat the realism and detail of the pseudo ground truth. Runtime optimizations on the ESP32-S3 accelerate the average inference speed significantly, by more than factor 3. With several algorithmic adjustments and distillation, it is possible to eNCApsulate NCA models into microcontrollers that fit into wireless capsule endoscopes. This is the first work that enables reliable bleeding segmentation and depth estimation on a miniaturized device, paving the way for precise diagnosis combined with visual odometry as a means of precise localization of the capsule -- on the capsule.

artificial intelligence, image understanding, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.21562

Country: Europe > Germany > Hesse (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

In vivo validation of Wireless Power Transfer System for Magnetically Controlled Robotic Capsule Endoscopy

Catania, Alessandro, Bertozzi, Michele, Greenidge, Nikita J., Calme, Benjamin, Bandini, Gabriele, Sbrana, Christian, Cecchi, Roberto, Buffi, Alice, Strangio, Sebastiano, Valdastri, Pietro, Iannaccone, Giuseppe

arXiv.org Artificial IntelligenceMar-17-2025

This paper presents the in vivo validation of an inductive wireless power transfer (WPT) system integrated for the first time into a magnetically controlled robotic capsule endoscopy platform. The proposed system enables continuous power delivery to the capsule without the need for onboard batteries, thus extending operational time and reducing size constraints. The WPT system operates through a resonant inductive coupling mechanism, based on a transmitting coil mounted on the end effector of a robotic arm that also houses an external permanent magnet and a localization coil for precise capsule manipulation. To ensure robust and stable power transmission in the presence of coil misalignment and rotation, a 3D receiving coil is integrated within the capsule. Additionally, a closed-loop adaptive control system, based on load-shift keying (LSK) modulation, dynamically adjusts the transmitted power to optimize efficiency while maintaining compliance with specific absorption rate (SAR) safety limits. The system has been extensively characterized in laboratory settings and validated through in vivo experiments using a porcine model, demonstrating reliable power transfer and effective robotic navigation in realistic gastrointestinal conditions: the average received power was 110 mW at a distance of 9 cm between the coils, with variable capsule rotation angles. The results confirm the feasibility of the proposed WPT approach for autonomous, battery-free robotic capsule endoscopy, paving the way for enhanced diagnostic in gastrointestinal medicine.

artificial intelligence, capsule, coil, (15 more...)

arXiv.org Artificial Intelligence

2503.1285

Country:

Europe > Italy > Tuscany > Pisa Province > Pisa (0.05)
Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)
Europe > Austria > Upper Austria > Linz (0.04)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Influence of color correction on pathology detection in Capsule Endoscopy

Agossou, Bidossessi Emmanuel, Pedersen, Marius, Raja, Kiran, Vats, Anuja, Floor, Pål Anders

arXiv.org Artificial IntelligenceJan-31-2025

Pathology detection in Wireless Capsule Endoscopy (WCE) using deep learning has been explored in the recent past. However, deep learning models can be influenced by the color quality of the dataset used to train them, impacting detection, segmentation and classification tasks. In this work, we evaluate the impact of color correction on pathology detection using two prominent object detection models: Retinanet and YOLOv5. We first generate two color corrected versions of a popular WCE dataset (i.e., SEE-AI dataset) using two different color correction functions. We then evaluate the performance of the Retinanet and YOLOv5 on the original and color corrected versions of the dataset. The results reveal that color correction makes the models generate larger bounding boxes and larger intersection areas with the ground truth annotations. Furthermore, color correction leads to an increased number of false positives for certain pathologies. However, these effects do not translate into a consistent improvement in performance metrics such as F1-scores, IoU, and AP50.

artificial intelligence, color correction, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2502.00076

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.04)

Genre: Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

V$^2$-SfMLearner: Learning Monocular Depth and Ego-motion for Multimodal Wireless Capsule Endoscopy

Bai, Long, Cui, Beilei, Wang, Liangyu, Li, Yanheng, Yao, Shilong, Yuan, Sishen, Wu, Yanan, Zhang, Yang, Meng, Max Q. -H., Li, Zhen, Ding, Weiping, Ren, Hongliang

arXiv.org Artificial IntelligenceDec-23-2024

Deep learning can predict depth maps and capsule ego-motion from capsule endoscopy videos, aiding in 3D scene reconstruction and lesion localization. However, the collisions of the capsule endoscopies within the gastrointestinal tract cause vibration perturbations in the training data. Existing solutions focus solely on vision-based processing, neglecting other auxiliary signals like vibrations that could reduce noise and improve performance. Therefore, we propose V$^2$-SfMLearner, a multimodal approach integrating vibration signals into vision-based depth and capsule motion estimation for monocular capsule endoscopy. We construct a multimodal capsule endoscopy dataset containing vibration and visual signals, and our artificial intelligence solution develops an unsupervised method using vision-vibration signals, effectively eliminating vibration perturbations through multimodal learning. Specifically, we carefully design a vibration network branch and a Fourier fusion module, to detect and mitigate vibration noises. The fusion framework is compatible with popular vision-only algorithms. Extensive validation on the multimodal dataset demonstrates superior performance and robustness against vision-only algorithms. Without the need for large external equipment, our V$^2$-SfMLearner has the potential for integration into clinical capsule robots, providing real-time and dependable digestive examination tools. The findings show promise for practical implementation in clinical settings, enhancing the diagnostic capabilities of doctors.

artificial intelligence, estimation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.17595

Country:

Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > China > Jiangsu Province > Nanjing (0.04)
(16 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(2 more...)

Add feedback

Domain-Adaptive Pre-training of Self-Supervised Foundation Models for Medical Image Classification in Gastrointestinal Endoscopy

Roth, Marcel, Nowak, Micha V., Krenzer, Adrian, Puppe, Frank

arXiv.org Artificial IntelligenceDec-11-2024

Video capsule endoscopy has transformed gastrointestinal endoscopy (GIE) diagnostics by offering a non-invasive method for capturing detailed images of the gastrointestinal tract, enabling early disease detection. However, its potential is limited by the sheer volume of images generated during the imaging procedure, which can take anywhere from 6-8 hours and often produce up to 1 million images, necessitating automated analysis. Additionally, the variability of these images, combined with the need for expert annotations and the scarcity of large, high-quality labeled datasets, constrains the effectiveness of current medical image analysis models. To address this, we introduce a novel large GIE dataset, called EndoExtend24, created by merging ten existing public and private datasets, ensuring patient integrity across splits. EndoExtend24 includes over 226,000 labeled images, as well as dynamic class mappings, which allow unified training across datasets with differing labeling granularity, supporting up to 123 distinct pathological findings. Further, we propose to leverage domain adaptive pre-training of foundation models trained with self-supervision on generic image data, to adapt them to the task of GIE medical image diagnosis. Specifically, the EVA-02 model, which is based on the ViT architecture and trained on ImageNet-22k with masked image modeling (using EVA-CLIP as a MIM teacher), is pre-trained on the EndoExtend24 dataset to achieve domain adaptation, and finally trained on the Capsule Endoscopy 2024 Challenge dataset. Our model demonstrates robust performance, securing third place in the Capsule Endoscopy 2024 Challenge. We achieved a macro AUC of 0.762 and a balanced accuracy of 37.1% on the test set. These results emphasize the effectiveness of our domain-adaptive pre-training approach and the enriched EndoExtend24 dataset in advancing gastrointestinal endoscopy diagnostics.

balanced accuracy, classification, dataset, (13 more...)

arXiv.org Artificial Intelligence

2410.21302

Country: Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)

Genre:

Research Report (1.00)
Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Add feedback

Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting

Chłopowiec, Adrian B., Chłopowiec, Adam R., Galus, Krzysztof, Cebula, Wojciech, Tabakov, Martin

arXiv.org Artificial IntelligenceDec-4-2024

Limited medical imaging datasets challenge deep learning models by increasing risks of overfitting and reduced generalization, particularly in Generative Adversarial Networks (GANs), where discriminators may overfit, leading to training divergence. This constraint also impairs classification models trained on small datasets. Generative Data Augmentation (GDA) addresses this by expanding training datasets with synthetic data, although it requires training a generative model. We propose and evaluate two local lesion generation approaches to address the challenge of augmenting small medical image datasets. The first approach employs the Poisson Image Editing algorithm, a classical image processing technique, to create realistic image composites that outperform current state-of-the-art methods. The second approach introduces a novel generative method, leveraging a fine-tuned Image Inpainting GAN to synthesize realistic lesions within specified regions of real training images. A comprehensive comparison of the two proposed methods demonstrates that effective local lesion generation in a data-constrained setting allows for reaching new state-of-the-art results in capsule endoscopy lesion classification. Combination of our techniques achieves a macro F1-score of 33.07%, surpassing the previous best result by 7.84 percentage points (p.p.) on the highly imbalanced Kvasir Capsule Dataset, a benchmark for capsule endoscopy. To the best of our knowledge, this work is the first to apply a fine-tuned Image Inpainting GAN for GDA in medical imaging, demonstrating that an image-conditional GAN can be adapted effectively to limited datasets to generate high-quality examples, facilitating effective data augmentation. Additionally, we show that combining this GAN-based approach with classical image processing techniques further improves the results.

dataset, iida, lesion, (12 more...)

arXiv.org Artificial Intelligence

2411.03098

Country:

Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Trajectory Planning and Control for Robotic Magnetic Manipulation

Isitman, Ogulcan, Alcan, Gokhan, Kyrki, Ville

arXiv.org Artificial IntelligenceNov-22-2024

Robotic magnetic manipulation offers a minimally invasive approach to gastrointestinal examinations through capsule endoscopy. However, controlling such systems using external permanent magnets (EPM) is challenging due to nonlinear magnetic interactions, especially when there are complex navigation requirements such as avoidance of sensitive tissues. In this work, we present a novel trajectory planning and control method incorporating dynamics and navigation requirements, using a single EPM fixed to a robotic arm to manipulate an internal permanent magnet (IPM). Our approach employs a constrained iterative linear quadratic regulator that considers the dynamics of the IPM to generate optimal trajectories for both the EPM and IPM. Extensive simulations and real-world experiments, motivated by capsule endoscopy operations, demonstrate the robustness of the method, showcasing resilience to external disturbances and precise control under varying conditions. The experimental results show that the IPM reaches the goal position with a maximum mean error of 0.18 cm and a standard deviation of 0.21 cm. This work introduces a unified framework for constrained trajectory optimization in magnetic manipulation, directly incorporating both the IPM's dynamics and the EPM's manipulability.

artificial intelligence, optimization problem, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2411.1495

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Finland > Pirkanmaa > Tampere (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (0.89)
Health & Medicine > Diagnostic Medicine > Imaging (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.34)

Add feedback

Artificial Intelligence in Gastrointestinal Bleeding Analysis for Video Capsule Endoscopy: Insights, Innovations, and Prospects (2008-2023)

Singh, Tanisha, Jha, Shreshtha, Bhatt, Nidhi, Handa, Palak, Goel, Nidhi, Indu, Sreedevi

arXiv.org Artificial IntelligenceSep-1-2024

The escalating global mortality and morbidity rates associated with gastrointestinal (GI) bleeding, compounded by the complexities and limitations of traditional endoscopic methods, underscore the urgent need for a critical review of current methodologies used for addressing this condition. With an estimated 300,000 annual deaths worldwide, the demand for innovative diagnostic and therapeutic strategies is paramount. The introduction of Video Capsule Endoscopy (VCE) has marked a significant advancement, offering a comprehensive, non-invasive visualization of the digestive tract that is pivotal for detecting bleeding sources unattainable by traditional methods. Despite its benefits, the efficacy of VCE is hindered by diagnostic challenges, including time-consuming analysis and susceptibility to human error. This backdrop sets the stage for exploring Machine Learning (ML) applications in automating GI bleeding detection within capsule endoscopy, aiming to enhance diagnostic accuracy, reduce manual labor, and improve patient outcomes. Through an exhaustive analysis of 113 papers published between 2008 and 2023, this review assesses the current state of ML methodologies in bleeding detection, highlighting their effectiveness, challenges, and prospective directions. It contributes an in-depth examination of AI techniques in VCE frame analysis, offering insights into open-source datasets, mathematical performance metrics, and technique categorization. The paper sets a foundation for future research to overcome existing challenges, advancing gastrointestinal diagnostics through interdisciplinary collaboration and innovation in ML applications.

classification, detection, vce frame, (14 more...)

arXiv.org Artificial Intelligence

2409.00639

Country:

Europe > Austria (0.14)
Asia > India > NCT > Delhi (0.04)
North America > United States (0.04)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy

Bai, Long, Chen, Tong, Tan, Qiaozhi, Nah, Wan Jun, Li, Yanheng, He, Zhicheng, Yuan, Sishen, Chen, Zhen, Wu, Jinlin, Islam, Mobarakol, Li, Zhen, Liu, Hongbin, Ren, Hongliang

arXiv.org Artificial IntelligenceJul-8-2024

Wireless Capsule Endoscopy (WCE) is highly valued for its non-invasive and painless approach, though its effectiveness is compromised by uneven illumination from hardware constraints and complex internal dynamics, leading to overexposed or underexposed images. While researchers have discussed the challenges of low-light enhancement in WCE, the issue of correcting for different exposure levels remains underexplored. To tackle this, we introduce EndoUIC, a WCE unified illumination correction solution using an end-to-end promptable diffusion transformer (DiT) model. In our work, the illumination prompt module shall navigate the model to adapt to different exposure levels and perform targeted image enhancement, in which the Adaptive Prompt Integration (API) and Global Prompt Scanner (GPS) modules shall further boost the concurrent representation learning between the prompt parameters and features. Besides, the U-shaped restoration DiT model shall capture the long-range dependencies and contextual information for unified illumination restoration. Moreover, we present a novel Capsule-endoscopy Exposure Correction (CEC) dataset, including ground-truth and corrupted image pairs annotated by expert photographers. Extensive experiments against a variety of state-of-the-art (SOTA) methods on four datasets showcase the effectiveness of our proposed method and components in WCE illumination restoration, and the additional downstream experiments further demonstrate its utility for clinical diagnosis and surgical assistance.

dataset, endouic, unified illumination correction, (14 more...)

arXiv.org Artificial Intelligence

2406.13705

Country:

Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.05)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
(6 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback