vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention
