Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA