The Peril of Preference: Why GRPO fails on Ordinal Rewards