Lightweight and Post-Training Structured Pruning for On-Device Large Language Models
