Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents
Zihao Wang
