Beyond Factual Accuracy: Evaluating Coverage of Diverse Factual Information in Long-form Text Generation