SpecPV: Improving Self-Speculative Decoding for Long-Context Generation via Partial Verification
