CAST: Counterfactual Labels Improve Instruction Following in Vision-Language-Action Models
