QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models