HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention