FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
