A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models

Dec-24-2025, 13:05:32 GMT–Neural Information Processing Systems

Despite the remarkable success of pre-trained language models (PLMs), they still face two challenges. First, large-scale PLMs are inefficient in terms of memory footprint and computation. Second, on downstream tasks, PLMs tend to rely on dataset bias and struggle to generalize to out-of-distribution (OOD) data. In response to the efficiency problem, recent studies show that dense PLMs can be replaced with sparse subnetworks without hurting performance. Such subnetworks can be found in three scenarios: 1) in fine-tuned PLMs, 2) in raw PLMs, with the subnetwork then fine-tuned in isolation, and even 3) in PLMs without any parameter fine-tuning. However, these results have only been obtained in the in-distribution (ID) setting.
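To make the idea of a sparse subnetwork concrete, the sketch below builds a binary mask over a single weight matrix by keeping only the largest-magnitude entries. Magnitude pruning is assumed here purely for illustration; the abstract does not say which pruning method is used. The function name `magnitude_mask`, the 8x8 random matrix, and the 75% sparsity level are all hypothetical.

```python
import numpy as np


def magnitude_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a 0/1 mask that zeroes the smallest-magnitude weights.

    `sparsity` is the fraction of entries to prune (illustrative choice).
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(round(sparsity * weights.size))
    mask = np.ones(weights.size, dtype=np.float32)
    if n_prune > 0:
        flat = np.abs(weights).ravel()
        # Indices of the n_prune entries with the smallest magnitude.
        prune_idx = np.argpartition(flat, n_prune - 1)[:n_prune]
        mask[prune_idx] = 0.0
    return mask.reshape(weights.shape)


# Stand-in for one dense weight matrix of a PLM (random, not real weights).
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))

mask = magnitude_mask(W, sparsity=0.75)
W_sparse = W * mask  # the "subnetwork": pruned weights are exactly zero
```

In the three scenarios listed above, a mask of this kind would be applied to fine-tuned weights, to raw pre-trained weights that are then fine-tuned with the mask held fixed, or to raw weights that are left untouched; only the mask-finding step is sketched here.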

robust pre-trained language model, subnetwork, win-win deal, (9 more...)

Genre:
- Research Report > New Finding (0.55)

Technology:
- Information Technology > Artificial Intelligence > Natural Language (0.78)