DRONE: Data-aware Low-rank Compression for Large NLP Models

Neural Information Processing Systems 

The representations learned by large-scale NLP models such as BERT have been widely used across various tasks. However, the growing size of these pre-trained models brings efficiency challenges, including slower inference and a larger memory footprint, which are especially problematic when deploying models on mobile devices. Specifically, most of the computation in BERT consists of matrix multiplications. The weight matrices involved are not low-rank, so canonical matrix decomposition cannot yield an efficient approximation. In this paper, we observe that the learned representation of each layer lies in a low-dimensional space.
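To make this observation concrete, the following minimal numpy sketch (not the paper's implementation) contrasts data-oblivious truncated SVD of a full-rank weight matrix with a simple data-aware alternative that exploits inputs confined to a low-dimensional subspace. The dimensions, the synthetic data, and the projection step are illustrative assumptions, not DRONE's exact algorithm, which derives a provably optimal decomposition.

```python
# Minimal sketch: why data-aware low-rank compression can succeed where
# plain SVD of the weights fails. Assumes a full-rank weight matrix W but
# inputs X that lie in a rank-k subspace, mirroring the paper's observation
# about layer representations.
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 768, 4096, 64              # hidden size, #tokens, true input rank

W = rng.standard_normal((d, d))      # full-rank weight: SVD truncation is lossy
B = rng.standard_normal((d, k))      # basis of the low-dimensional input subspace
X = B @ rng.standard_normal((k, n))  # activations confined to a rank-k subspace

# Data-oblivious baseline: truncated SVD of W itself at rank k.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_svd = (U[:, :k] * s[:k]) @ Vt[:k, :]

# Data-aware alternative (one simple instantiation of the idea): take the
# top-k left singular vectors of the *outputs* W @ X and project W onto
# that subspace. The result is rank-k, storable as two d-by-k factors.
Uo, _, _ = np.linalg.svd(W @ X, full_matrices=False)
W_data = Uo[:, :k] @ (Uo[:, :k].T @ W)

# Compare errors on the outputs actually produced for this data.
err = lambda M: np.linalg.norm(W @ X - M @ X) / np.linalg.norm(W @ X)
print(f"plain SVD  relative output error: {err(W_svd):.3f}")
print(f"data-aware relative output error: {err(W_data):.3f}")
```

Because X has rank k, the columns of W @ X span at most a k-dimensional space, so the data-aware projection reproduces the layer's outputs almost exactly, while truncating W directly discards most of its spectrum and incurs a large error.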