Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting

Fangcheng Liu, Yehui Tang, Zhenhua Liu, Yunsheng Ni, Duyu Tang, Kai Han, Yunhe Wang
