Optimizing Language Models for Human Preferences is a Causal Inference Problem