Beyond Words and Pixels: A Benchmark for Implicit World Knowledge Reasoning in Generative Models