GROOT: Corrective Reward Optimization for Generative Sequential Labeling