When Are Two Scores Better Than One? Investigating Ensembles of Diffusion Models