MiniKV: Pushing the Limits of LLM Inference via 2-Bit Layer-Discriminative KV Cache