Multimodal-Aware Weakly Supervised Metric Learning with Self-weighting Triplet Loss