Kelle: Co-design KV Caching and eDRAM for Efficient LLM Serving in Edge Computing
