Learning Probability Measures with Respect to Optimal Transport Metrics