A tutorial note on collecting simulated data for vision-language-action models
