Landscape of Sparse Linear Network: A Brief Investigation

Dachao Lin, Ruoyu Sun, Zhihua Zhang

arXiv.org Machine Learning 

Deep neural networks have achieved remarkable empirical successes in computer vision, speech recognition, and natural language processing, sparking interest in the theory behind their architectures and training. However, deep neural networks are often highly overparameterized, which makes them computationally expensive, demanding large amounts of memory and compute; training can take up to weeks on a modern multi-GPU server for large datasets such as ImageNet (Deng et al. [7]). Hence, they are often unsuitable for smaller devices such as embedded electronics, and there is a pressing demand for techniques that yield models with reduced size, faster inference, and lower power consumption. Sparse networks, that is, neural networks in which a large subset of the model parameters are zero, have emerged as one of the leading approaches for reducing model parameter count. It has been shown empirically that deep neural networks can achieve state-of-the-art results under high levels of sparsity [17, 14, 22].
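As a minimal illustration of this notion of sparsity (a sketch, not part of the paper), the following NumPy snippet applies magnitude pruning to a hypothetical weight matrix, zeroing out the smallest-magnitude entries so that only a small fraction remain nonzero.

```python
import numpy as np

# Illustrative sketch only: a "sparse" weight matrix is one in which a large
# fraction of entries are exactly zero. One common way to obtain such a matrix
# is magnitude pruning: zero out the smallest-magnitude weights.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))          # hypothetical dense weight matrix

sparsity = 0.75                          # fraction of weights to remove
threshold = np.quantile(np.abs(W), sparsity)
mask = np.abs(W) >= threshold            # keep only the largest-magnitude weights
W_sparse = W * mask

print(f"nonzero fraction: {np.count_nonzero(W_sparse) / W_sparse.size:.2f}")
```

In this toy example roughly 25% of the entries survive; in practice the cited works [17, 14, 22] show that much deeper networks remain accurate at comparable or higher sparsity levels.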
