LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
