SparQ Attention: Bandwidth-Efficient LLM Inference