CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents