 decomposed layer


Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization

arXiv.org Artificial Intelligence

Low Rank Decomposition (LRD) is a model compression technique applied to the weight tensors of deep learning models to reduce the number of trainable parameters and the computational complexity. However, because applying LRD adds a large number of new layers to the architecture, it may not yield a significant training or inference speedup unless the decomposition ranks are small enough. The difficulty is that small ranks increase the risk of a significant accuracy drop after decomposition. In this paper, we propose two techniques for accelerating low-rank decomposed models without requiring small decomposition ranks: rank optimization and sequential freezing of decomposed layers. We perform experiments on both convolutional and transformer-based models. The experiments show that, when combined, these techniques can improve model throughput by up to 60% during training and 37% during inference while keeping accuracy close to that of the original models.
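The trade-off between rank and savings can be illustrated with a simple parameter count. The sketch below is not taken from the paper: it assumes a single dense weight matrix of shape m x n factored into two matrices of shapes m x r and r x n, ignores biases, and uses hypothetical helper names. Parameter count is only a rough proxy for cost; as the abstract notes, the extra layers introduced by decomposition can limit the actual speedup.

```python
def dense_params(m, n):
    """Weights in a dense m x n matrix (bias ignored)."""
    return m * n


def low_rank_params(m, n, r):
    """Weights after factoring the m x n matrix into (m x r) and (r x n)."""
    return r * (m + n)


def break_even_rank(m, n):
    """Largest integer rank r for which r * (m + n) < m * n.

    Any rank above this value gives no parameter savings at all.
    """
    return (m * n - 1) // (m + n)


if __name__ == "__main__":
    m, n = 512, 512
    print("dense:", dense_params(m, n))
    for r in (32, 64, 128, 256):
        print("rank", r, "->", low_rank_params(m, n, r))
    print("break-even rank:", break_even_rank(m, n))
```

For a square 512 x 512 layer, the factored form only has fewer weights than the dense one when the rank is below 256, which is why the choice of rank matters so much for any realized speedup.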


GroSS: Group-Size Series Decomposition for Whole Search-Space Training

arXiv.org Machine Learning

GroSS allows for dynamic and differentiable selection of factorisation rank, which is analogous to a grouped convolution. To the best of our knowledge, GroSS is therefore the first method to simultaneously train differing numbers of groups within a single layer, as well as all possible combinations between layers. In doing so, GroSS trains an entire grouped convolution architecture search-space concurrently. We demonstrate this through proof-of-concept architecture searches with performance objectives. GroSS represents a significant step towards liberating network architecture search from the burden of training and fine-tuning. These methods have usually involved careful network design, often relying on domain knowledge to design a structure that can encapsulate the task at hand. Neural Architecture Search (NAS) has provided an alternative to hand-designed networks, allowing for the search and even direct optimisation of the network's structure. However, the search space for architectures is often vast, with potentially limitless design choices. Furthermore, each configuration must undergo some training or fine-tuning for its efficacy to be determined.
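To give a sense of scale for the search space described above, the following sketch counts the weights of a standard grouped convolution and the number of distinct per-layer group configurations. It is an illustration under stated assumptions, not the GroSS formulation itself: the function names are hypothetical, biases are ignored, and each layer is assumed to choose its group count independently from a fixed list of options.

```python
def grouped_conv_params(c_in, c_out, k, groups):
    """Weights in a k x k grouped convolution (bias ignored).

    Each output channel only sees c_in / groups input channels.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def num_configurations(options_per_layer):
    """Number of architectures when each layer picks one option independently."""
    total = 1
    for n_options in options_per_layer:
        total *= n_options
    return total


if __name__ == "__main__":
    for g in (1, 2, 4, 8):
        print("groups", g, "->", grouped_conv_params(64, 128, 3, g))
    # Ten layers with five candidate group counts each (assumed numbers).
    print("configurations:", num_configurations([5] * 10))
```

Even with only five options per layer, ten layers already give several million configurations, which is why training or fine-tuning each candidate individually quickly becomes impractical.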