SDMPrune: Self-Distillation MLP Pruning for Efficient Large Language Models
