Large Language Models' Reasoning Stalls: An Investigation into the Capabilities of Frontier Models