LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
