Rebalanced Vision-Language Retrieval Considering Structure-Aware Distillation