Rule Based Rewards for Language Model Safety