HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration