HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models
