AGFT: An Adaptive GPU Frequency Tuner for Real-Time LLM Inference Optimization
