ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism
