Reward-Safety Balance in Offline Safe RL via Diffusion Regularization