EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy