Transformer Compression via Subspace Projection
Yuxuan Hu, Jing Zhang, Chen Zhao, Cuiping Li, Hong Chen
arXiv.org Artificial Intelligence
We propose TCSP, a novel method for compressing a transformer model by reducing its hidden size. By projecting the whole transformer model into a subspace, we enable matrix operations between the model's weight matrices and features in a reduced-dimensional space, leading to significant reductions in model parameters and computing resources. To establish this subspace, we decompose the feature matrix, derived from different layers over sampled data instances, into a projection matrix. For evaluation, TCSP is applied to compress T5 and BERT models on the GLUE and SQuAD benchmarks. Experimental results demonstrate that TCSP achieves a compression ratio of 44% with at most 1.6% degradation in accuracy, surpassing or matching prior compression methods. Furthermore, TCSP is compatible with other methods that compress filter and attention-head sizes.
Aug-31-2023