ReGA: Representation-Guided Abstraction for Model-based Safeguarding of LLMs
