Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications