Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE
