XL$^2$Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies
