lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models
