AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration
