Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
– Neural Information Processing Systems
We then comprehensively study detoxifying LMs with parameter sizes ranging from 126M up to 530B (3× larger than GPT-3), a scale that has never been studied before. We find that i) large LMs have similar toxicity levels as smaller ones given the same pre-training corpus, and ii) large LMs require more effort to unlearn the toxic content seen at pretraining. We also explore parameter-efficient training methods for detoxification.