Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models