Certifiable Safe RLHF: Fixed-Penalty Constraint Optimization for Safer Language Models