DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment
