Supplementary Material for Text Promptable Surgical Instrument Segmentation with Vision-Language Models
Zijian Zhou
Neural Information Processing Systems
They are used in our experiments section.

OpenAI GPT-4 based prompts

The input template for OpenAI GPT-4 is defined as: "Please describe the appearance of [class_name] in endoscopic surgery, and change the description to a phrase with subject, and not use colons."

The dataset consists of both training and test cases. Each video is recorded at 25 FPS and has annotations for instruments and operation phases.

For EndoVis2019, the results are shown in Tab. 1. Our method (input size 448) notably surpasses the competition's top performers, with a +3% increase in DSC and a +2% improvement in NSD, which demonstrates the superiority of our method.
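As an illustration of how the GPT-4 input template quoted above could be instantiated per instrument class, here is a minimal sketch. It only performs the string substitution for the [class_name] placeholder; the helper name build_prompt and the example class names are hypothetical and not taken from the paper, and no call to any GPT-4 API is shown.

```python
# Template text copied from the supplementary material;
# "[class_name]" is the placeholder it defines.
TEMPLATE = (
    "Please describe the appearance of [class_name] in endoscopic surgery, "
    "and change the description to a phrase with subject, and not use colons."
)


def build_prompt(class_name: str) -> str:
    """Substitute one instrument class name into the template (hypothetical helper)."""
    return TEMPLATE.replace("[class_name]", class_name)


# Hypothetical class names, for illustration only; the actual class list
# depends on the dataset being used.
example_classes = ["bipolar forceps", "monopolar curved scissors"]

# One prompt string per class, keyed by class name.
prompts = {name: build_prompt(name) for name in example_classes}
```

Each resulting string would then be sent to the language model as a separate query, so that every class receives its own appearance description.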
Feb-12-2026, 06:38:17 GMT