Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension