On Calibration of LLM-based Guard Models for Reliable Content Moderation