SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
