Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
