The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism