EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse