Generating Benchmarks for Factuality Evaluation of Language Models
