AVSS: Layer Importance Evaluation in Large Language Models via Activation Variance-Sparsity Analysis

Song, Zichen, Wu, Yuxin, Huang, Sitan, Kang, Zhongfeng

arXiv.org Artificial Intelligence 

The evaluation of layer importance in deep learning has been an active area of research, with significant implications for model optimization and interpretability. Recently, large language models (LLMs) have gained prominence across various domains, yet limited studies have explored the functional importance and performance contributions of individual layers within LLMs, especially from the perspective of activation distribution. In this work, we propose the Activation Variance-Sparsity Score (AVSS), a novel metric combining normalized activation variance and sparsity to assess each layer's contribution to model performance. By identifying and removing approximately the lowest 25% of layers based on AVSS, we achieve over 90% of original model performance across tasks.

Additionally, Zopf et al. [2] introduced Layer-wise Relevance Propagation (LRP), including its variants, to analyze the flow of information in complex neural networks, providing a more nuanced understanding of each layer's contribution to the model's decisions. Furthermore, the work of Mencía et al. [12] highlighted the significance of Contextual Importance Measures (CIM), which integrate contextual information to dynamically evaluate the importance of each layer based on specific input conditions, thus overcoming the limitations of static assessment methods. However, these approaches often struggle to fully capture the intricate activation distributions and redundancy within large language models, limiting their effectiveness in identifying less critical layers.
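The excerpt does not give the exact AVSS formula, but the described pipeline (score each layer by combining normalized activation variance with sparsity, then drop roughly the lowest 25%) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the particular combination of variance and sparsity used here, the near-zero threshold, and the helper names `avss_scores` and `lowest_quartile` are all assumptions.

```python
import numpy as np

def avss_scores(layer_activations, sparsity_threshold=1e-6, eps=1e-12):
    """Illustrative variance-sparsity score, one value per layer.

    layer_activations: list of arrays, one per layer, each holding that
    layer's activations collected over a batch of inputs. The exact AVSS
    formula is not stated in this excerpt; here we use normalized
    variance scaled by (1 - sparsity) as a plausible proxy, so layers
    with low variance and many near-zero activations score lowest.
    """
    variances = np.array([a.var() for a in layer_activations])
    # Sparsity: fraction of activations that are (near) zero per layer.
    sparsities = np.array(
        [np.mean(np.abs(a) < sparsity_threshold) for a in layer_activations]
    )
    # Normalize variance across layers so scores are comparable.
    norm_var = variances / (variances.sum() + eps)
    return norm_var * (1.0 - sparsities)

def lowest_quartile(scores):
    """Indices of roughly the lowest 25% of layers by score."""
    k = max(1, len(scores) // 4)
    return sorted(np.argsort(scores)[:k].tolist())
```

Under this sketch, pruning would remove the layers returned by `lowest_quartile` and then re-evaluate the truncated model on the downstream tasks to confirm the retained-performance claim.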