PyramidCLIP: Hierarchical Feature Alignment for Vision-language Model Pretraining
Anonymous Author(s) Affiliation Address email