Why data remains the greatest challenge for machine learning projects