Cerberus: Efficient Inference with Adaptive Parallel Decoding and Sequential Knowledge Enhancement
