D-LLM: A Token Adaptive Computing Resource Allocation Strategy for Large Language Models
