Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
