EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy
