LongForm: Optimizing Instruction Tuning for Long Text Generation with Corpus Extraction