CARE: Decoding-Time Safety Alignment via Rollback and Introspection Intervention

Neural Information Processing Systems 

As large language models (LLMs) are increasingly deployed in real-world applications, ensuring the safety of their outputs during decoding has become a critical challenge.
