Language Models as Black-Box Optimizers for Vision-Language Models