Advancing Visual Grounding with Scene Knowledge: Benchmark and Method