LLM-BIP: Structured Pruning for Large Language Models with Block-Wise Forward Importance Propagation
