Understanding Contrastive Learning via Distributionally Robust Optimization