VLA Model Post-Training via Action-Chunked PPO and Self Behavior Cloning
