What Are They Filtering Out? A Survey of Filtering Strategies for Harm Reduction in Pretraining Datasets
