CLIP feature-based randomized control using images and text for multiple tasks and robots