Aggregation of Dependent Expert Distributions in Multimodal Variational Autoencoders