KL-Regularized Reinforcement Learning is Designed to Mode Collapse