Position: Don't use the CLT in LLM evals with fewer than a few hundred datapoints
