Mitigating Covertly Unsafe Text within Natural Language Systems