F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
