Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit