Renaissance: Investigating the Pretraining of Vision-Language Encoders