Realizing LLMs' Causal Potential Requires Science-Grounded, Novel Benchmarks