DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale