Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models