AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
