NIRVANA: Structured pruning reimagined for large language models compression
