Stateful Large Language Model Serving with Pensieve
arXiv.org Artificial Intelligence
Existing LLM serving systems are stateless across requests. Consequently, when LLMs are used in the common setting of multi-turn conversations, a growing log of the conversation history must be processed alongside any request by the serving system at each turn, resulting in repeated processing.

In the conversational setup, the user and the chatbot are engaged in a dialogue that may last many rounds. For the chatbot not to "lose memory" of what has been said so far when responding, the cumulative history of the dialogue must be part of the context for the LLM's autoregressive generation.
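To make the cost of statelessness concrete, the following is a minimal sketch, not Pensieve's implementation. It counts how many prompt tokens a server has to process over a conversation, assuming a hypothetical stateful server that retains all earlier state and therefore only processes the tokens added at each turn. The function name `prefill_tokens` and the turn lengths are illustrative assumptions.

```python
def prefill_tokens(turn_lengths, stateful):
    """Count prompt tokens processed by the server over a whole conversation.

    turn_lengths: number of new tokens added to the dialogue at each turn
                  (hypothetical values, chosen only for illustration).
    stateful:     if True, assume the server kept the state of all earlier
                  turns, so only the new tokens are processed (an idealized
                  assumption; no eviction or recomputation is modeled).
    """
    total = 0
    history = 0  # tokens accumulated in the conversation so far
    for new_tokens in turn_lengths:
        if stateful:
            total += new_tokens            # only the new part of the context
        else:
            total += history + new_tokens  # whole history is reprocessed
        history += new_tokens
    return total


turns = [100] * 10  # ten turns, 100 new tokens each (assumed)
stateless_cost = prefill_tokens(turns, stateful=False)
stateful_cost = prefill_tokens(turns, stateful=True)
print(stateless_cost, stateful_cost)
```

Under these assumptions the stateless count grows quadratically with the number of turns, while the stateful count grows linearly, which is the repeated work the abstract refers to.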
Dec-9-2023
- Country:
  - Asia > Middle East
    - Jordan (0.04)
  - Europe > Italy
    - Calabria > Catanzaro Province > Catanzaro (0.04)
  - North America > United States
    - New York > New York County > New York City (0.04)
  - Oceania > Australia
    - Australian Capital Territory > Canberra (0.04)
- Genre:
  - Research Report (0.50)
- Technology: