The Devil is in the Details: A Deep Dive into the Rabbit Hole of Data Filtering