GIFT: Generative Interpretable Fine-Tuning Transformers
Savadikar, Chinmay, Song, Xi, Wu, Tianfu
arXiv.org Artificial Intelligence
We present GIFT (Generative Interpretable Fine-tuning Transformers) for fine-tuning pretrained (often large) Transformer models on downstream tasks in a parameter-efficient way with built-in interpretability. GIFT is a deep parameter-residual learning method that addresses two problems in fine-tuning a pretrained Transformer model: where to apply parameter-efficient fine-tuning (PEFT) so that it is extremely lightweight yet sufficiently expressive, and how to learn the PEFT so that it directly exploits the knowledge of the pretrained model. For the former, we select the final projection (linear) layer in the multi-head self-attention of a Transformer model and verify its effectiveness. For the latter, in contrast to prior art that directly introduces new model parameters (often in low-rank approximation form) to be learned in fine-tuning with downstream data, we propose a method for learning to generate the fine-tuning parameters. GIFT is a hyper-Transformer that takes as input the pretrained parameters of the projection layer and generates its fine-tuning parameters using a proposed Parameter-to-Cluster Attention (PaCa). PaCa yields a simple clustering-based forward explainer that plays the role of semantic segmentation at test time. In experiments, GIFT is evaluated on the VTAB benchmark and the fine-grained visual classification (FGVC) benchmark, where it obtains significantly better performance than the prior art. Our code is available at https://github.com/savadikarc/gift
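The idea of generating a parameter residual from the pretrained projection weights can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function `generate_residual`, the matrices `P_cluster`, `W_q`, `W_k`, `W_v`, the choice to treat each weight row as a token, and the zero initialization of `W_v` are all assumptions made for illustration. The sketch only shows the general shape of a parameter-to-cluster attention: rows of the frozen weight matrix are softly grouped into a few clusters, each row then attends to those clusters, and the result is added to the pretrained weights as a residual.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def generate_residual(W, P_cluster, W_q, W_k, W_v):
    """Hypothetical parameter-to-cluster attention over a weight matrix.

    W         : (d_out, d_in) frozen pretrained projection weights;
                each row is treated as one token (an assumption).
    P_cluster : (d_in, M) learnable map scoring each token against M clusters.
    W_q, W_k  : (d_in, h) learnable query/key projections.
    W_v       : (d_in, d_in) learnable value projection.
    Returns the residual (d_out, d_in) and the soft assignment (d_out, M).
    """
    # Soft cluster assignment, normalized over tokens so that each
    # cluster is a weighted average of weight rows.
    assign = softmax(W @ P_cluster, axis=0)            # (d_out, M)
    clusters = assign.T @ W                            # (M, d_in)

    # Each weight row (query) attends to the M clusters (keys/values).
    q = W @ W_q                                        # (d_out, h)
    k = clusters @ W_k                                 # (M, h)
    v = clusters @ W_v                                 # (M, d_in)
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)  # (d_out, M)
    return attn @ v, assign

rng = np.random.default_rng(0)
d, M, h = 8, 3, 4                       # toy sizes, chosen arbitrarily
W = rng.normal(size=(d, d))             # stands in for pretrained weights
P_cluster = 0.1 * rng.normal(size=(d, M))
W_q = 0.1 * rng.normal(size=(d, h))
W_k = 0.1 * rng.normal(size=(d, h))
W_v = np.zeros((d, d))                  # assumed zero init: residual starts at 0

delta, assign = generate_residual(W, P_cluster, W_q, W_k, W_v)
W_finetuned = W + delta                 # parameter-residual update
```

With the assumed zero initialization of `W_v`, the generated residual is zero at the start, so the fine-tuned layer initially matches the pretrained one; only the small generator matrices would be trained on downstream data. The soft assignment `assign` is the kind of clustering quantity that a clustering-based explainer could inspect, though how GIFT uses it at test time is not specified in the abstract.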
Dec-1-2023
- Country:
  - Africa > Rwanda
  - Asia
    - China (0.04)
    - India (0.04)
    - Middle East > Israel
      - Tel Aviv District > Tel Aviv (0.04)
  - Europe
    - Austria (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Romania > Sud - Muntenia Development Region
      - Giurgiu County > Giurgiu (0.04)
  - North America
    - Canada > British Columbia
    - Dominican Republic (0.04)
    - United States
      - California
        - Los Angeles County > Long Beach (0.14)
        - San Francisco County > San Francisco (0.14)
      - Colorado > El Paso County
        - Colorado Springs (0.04)
      - District of Columbia > Washington (0.04)
      - Florida > Miami-Dade County
        - Miami (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Maryland > Baltimore (0.04)
      - Massachusetts > Suffolk County
        - Boston (0.04)
      - North Carolina (0.04)
- Genre:
  - Research Report (0.50)
- Industry:
  - Health & Medicine (0.46)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning
      - Neural Networks > Deep Learning (0.67)
      - Statistical Learning (0.68)
    - Natural Language (1.00)
    - Vision (1.00)