Generating Symbolic World Models via Test-time Scaling of Large Language Models