CAST: Counterfactual Labels Improve Instruction Following in Vision-Language-Action Models