Safeguarding Large Language Models in Real-time with Tunable Safety-Performance Trade-offs