Endo-CLIP: Progressive Self-Supervised Pre-training on Raw Colonoscopy Records
Yili He, Yan Zhu, Peiyao Fu, Ruijie Yang, Tianyi Chen, Zhihua Wang, Quanlin Li, Pinghong Zhou, Xian Yang, Shuo Wang
–arXiv.org Artificial Intelligence
Pre-training on image-text colonoscopy records offers substantial potential for improving endoscopic image analysis, but faces challenges including non-informative background images, complex medical terminology, and ambiguous multi-lesion descriptions. We introduce Endo-CLIP, a novel self-supervised framework that enhances Contrastive Language-Image Pre-training (CLIP) for this domain. Endo-CLIP's three-stage framework (cleansing, attunement, and unification) addresses these challenges by: (1) removing background frames, (2) leveraging large language models (LLMs) to extract clinical attributes for fine-grained contrastive learning, and (3) employing patient-level cross-attention to resolve multi-polyp ambiguities. Extensive experiments demonstrate that Endo-CLIP significantly outperforms state-of-the-art pre-training methods in zero-shot and few-shot polyp detection and classification, paving the way for more accurate and clinically relevant endoscopic analysis. Code will be made publicly available at https://github.com/chrlott/
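The abstract builds on CLIP-style contrastive pre-training, which pulls matched image-report embedding pairs together and pushes mismatched pairs apart. As a minimal illustrative sketch (not the Endo-CLIP implementation; the function name and parameters here are hypothetical), the symmetric InfoNCE objective such methods optimize can be written in NumPy:

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    Illustrative sketch of CLIP-style pre-training: row i of img_emb and
    row i of txt_emb are a matched pair; all other pairings are negatives.
    """
    # L2-normalize so the dot product is cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # (N, N) similarity matrix
    labels = np.arange(len(logits))          # diagonal entries are the matches

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))

# Toy usage with random embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
img = rng.normal(size=(8, 16))
txt = rng.normal(size=(8, 16))
loss = clip_contrastive_loss(img, txt)
```

Endo-CLIP's stage-two "attunement" refines this objective with LLM-extracted clinical attributes, and stage three replaces the one-to-one pairing assumption with patient-level cross-attention to handle reports describing multiple polyps.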
May-15-2025
- Country:
- Asia > China
- Genre:
- Research Report (1.00)
- Industry:
- Health & Medicine
- Diagnostic Medicine > Imaging (1.00)
- Therapeutic Area
- Gastroenterology (1.00)
- Oncology > Colorectal Cancer (0.73)