Toward Effective Reinforcement Learning Fine-Tuning for Medical VQA in Vision-Language Models