Enabling Efficient Serverless Inference Serving for Large Language Models (LLMs) in the Cloud