Group-Aware Reinforcement Learning for Output Diversity in Large Language Models