Scaling LLM Planning: NL2FLOW for Parametric Problem Generation and Rigorous Evaluation