F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions