Iterative Layer-wise Distillation for Efficient Compression of Large Language Models
