Accelerating Blockwise Parallel Language Models with Draft Refinement
