OccVLA: Vision-Language-Action Model with Implicit 3D Occupancy Supervision
