Streaming, Fast and Slow: Cognitive Load-Aware Streaming for Efficient LLM Serving
