NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples