The biggest headache in machine learning? Cleaning dirty data off the spreadsheets