CHAI: Clustered Head Attention for Efficient LLM Inference