ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments