Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies