Limits of Emergent Reasoning of Large Language Models in Agentic Frameworks for Deterministic Games