HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models