GRPO-$\lambda$: Credit Assignment Improves LLM Reasoning