XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference