tinyBenchmarks: evaluating LLMs with fewer examples