E2Edev: Benchmarking Large Language Models in End-to-End Software Development Task