Parameter-Efficient Fine-Tuning with Layer Pruning on Free-Text Sequence-to-Sequence Modeling
Yunqi Zhu, Xuebing Yang, Yuanyuan Wu, Wensheng Zhang
The increasing size of language models has raised great research interest in parameter-efficient fine-tuning methods such as LoRA, which freeze the pre-trained model and inject a small number of trainable parameters for multiple downstream tasks (e.g., summarization, question answering, and translation). To further enhance the efficiency of fine-tuning, we propose a framework that integrates LoRA with structured layer pruning. The integrated framework is validated on two newly created de-identified medical report summarization datasets based on MIMIC-IV-Note and two public medical dialogue datasets. By tuning only 0.6% of the original model's parameters and pruning more than 30% of its Transformer layers, our framework reduces GPU memory usage by 50% and doubles the training speed, while preserving over 92% of the generation quality on free-text sequence-to-sequence tasks.
arXiv.org Artificial Intelligence
May-18-2023
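
The framework pairs LoRA adapters with structured removal of Transformer layers from a frozen backbone. Below is a minimal sketch of that idea, assuming a BART-style seq2seq model loaded via Hugging Face transformers and the PEFT library; the keep-every-other-layer rule, the attention-only LoRA targets, and the hyperparameters are illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch: structured layer pruning + LoRA on a BART-style seq2seq model.
# The layer-selection rule (keep every other layer) and LoRA settings are
# illustrative assumptions, not necessarily the paper's choices.
import torch.nn as nn
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

def prune_layers(layers: nn.ModuleList, keep_idx) -> nn.ModuleList:
    """Return a new ModuleList containing only the layers at keep_idx."""
    return nn.ModuleList([layers[i] for i in keep_idx])

# Drop roughly half of the layers in this toy example
# (the paper reports pruning over 30% of Transformer layers).
enc_keep = list(range(0, len(model.model.encoder.layers), 2))
dec_keep = list(range(0, len(model.model.decoder.layers), 2))
model.model.encoder.layers = prune_layers(model.model.encoder.layers, enc_keep)
model.model.decoder.layers = prune_layers(model.model.decoder.layers, dec_keep)
model.config.encoder_layers = len(enc_keep)
model.config.decoder_layers = len(dec_keep)

# Freeze the pruned backbone and inject low-rank adapters into the attention
# projections; only the LoRA parameters are updated during fine-tuning.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # trainable share on the order of ~0.6%
```

The pruned model can then be trained as usual (e.g., with the standard Seq2SeqTrainer); only the injected LoRA weights receive gradients, which is what drives the reported memory and speed savings.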