Investigating Regularization of Self-Play Language Models