Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models
