Fast Distributed Inference Serving for Large Language Models