
Neural Information Processing Systems 

We thank all reviewers for their careful reading and their detailed and constructive comments. We first address the shared reviewer comments and then individual ones.

On 5/8 datasets STAR DGP significantly outperforms MF DGP (µ > 0.50 + σ), while the opposite only

As suggested by R2, we also compared MF to FC DGP leading to similar results (see new table).

Train-test split (R2): We are the first to study the extrapolation behaviour of DGPs.

S2 and will move it to the main paper to facilitate comparison to related work.