Optimizing Data Collection for Machine Learning