Data Quality
Expect to be asked, in a data engineer or data scientist interview, what kinds of data issues you might encounter in your day job. Data quality often does more for model performance than any modeling technique. You could train a complicated deep learning model on massive amounts of data, but if the underlying data is bad, the model's inferences will be bad too. In this article, we address common data quality issues. As the Kaggle tutorial on handling missing values points out, we need to distinguish between values that are missing because they were not recorded and values that are missing because they don't exist.
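The distinction between the two kinds of missing values can be sketched with pandas. This is a minimal illustration, not the tutorial's own code: the DataFrame, its column names (`customer_age`, `coupon_code`), and the choice of median imputation are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical toy data; column names and values are invented for illustration.
df = pd.DataFrame({
    "customer_age": [34.0, None, 52.0, None],          # missing = not recorded
    "coupon_code": ["SAVE10", None, None, "WELCOME"],  # missing = no coupon was used
})

# Count missing values per column before deciding how to treat each one.
missing_counts = df.isna().sum()

# Not recorded: a real value exists but is unknown, so estimating it is reasonable.
# Median imputation is one simple option among many.
age_median = df["customer_age"].median()
df["customer_age_filled"] = df["customer_age"].fillna(age_median)

# Doesn't exist: there is nothing to estimate, so encode the absence explicitly
# with a flag and a sentinel label instead of guessing a value.
df["used_coupon"] = df["coupon_code"].notna()
df["coupon_code_filled"] = df["coupon_code"].fillna("NONE")

print(missing_counts)
print(df)
```

Both columns have the same number of gaps, yet they call for different handling: guessing an age is defensible, while guessing a coupon code would invent a purchase detail that never happened.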
Aug-1-2022, 20:10:11 GMT