SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model