Rebalanced Vision-Language Retrieval Considering Structure-Aware Distillation
