Rethinking Thinking Tokens: LLMs as Improvement Operators