Rethinking harmless refusals when fine-tuning foundation models
