Online Scheduling for LLM Inference with KV Cache Constraints