Block: Balancing Load in LLM Serving with Context, Knowledge and Predictive Scheduling