f-GRPO and Beyond: Divergence-Based Reinforcement Learning Algorithms for General LLM Alignment