HWPQ: Hessian-free Weight Pruning-Quantization For LLM Compression And Acceleration
