WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference
