Efficient multi-prompt evaluation of LLMs

Felipe Maia Polo

Neural Information Processing Systems 

Most popular benchmarks for comparing LLMs rely on a limited set of prompt templates, which may not fully capture the LLMs' abilities and can affect the
