TabR1: Taming GRPO for tabular reasoning LLMs
