Evaluating Large Language Models for Causal Modeling