Hierarchical Pretraining on Multimodal Electronic Health Records