VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation