Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization