Energy-Efficient Wireless LLM Inference via Uncertainty and Importance-Aware Speculative Decoding
