Understanding When Tree of Thoughts Succeeds: Larger Models Excel in Generation, Not Discrimination