Dialogue Without Limits: Constant-Sized KV Caches for Extended Responses in LLMs