A Single-Step, Sharpness-Aware Minimization is All You Need to Achieve Efficient and Accurate Sparse Training
Sparse training is a landmark approach to reducing the considerable training resources demanded by the continuously expanding size of Deep Neural Networks (DNNs). However, sparse DNNs still struggle to reach optimal generalization ability, despite the efforts of state-of-the-art sparse training methodologies. To uncover the reason behind this difficulty, we connect network sparsity with the structure of neural loss functions and identify a chaotic loss surface as the cause. In light of this finding, we propose $S^{2}$-SAM, a **S**ingle-step **S**harpness-**A**ware **M**inimization that is tailored for **S**parse training.
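For readers unfamiliar with sharpness-aware minimization, the sketch below shows the standard two-pass SAM update applied to a sparse (masked) weight vector. It is not the single-step $S^{2}$-SAM procedure, whose details are not given in this abstract. The toy quadratic loss, the `CURV` curvature values, the fixed `mask`, and the `lr` and `rho` settings are all assumptions made for illustration.

```python
import math

# Hypothetical per-weight curvatures of a toy quadratic loss (illustration only).
CURV = [1.0, 10.0, 0.1, 5.0]

def loss(w):
    # Toy stand-in for a network's training loss.
    return sum(0.5 * c * x * x for c, x in zip(CURV, w))

def grad(w):
    # Analytic gradient of the toy loss.
    return [c * x for c, x in zip(CURV, w)]

def apply_mask(v, mask):
    # Zero out the entries that belong to pruned weights.
    return [x * m for x, m in zip(v, mask)]

def sam_step(w, mask, lr=0.05, rho=0.05):
    # Pass 1: gradient at the current weights, restricted to active weights.
    g = apply_mask(grad(w), mask)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return list(w)
    # Move to the approximate worst-case point within an L2 ball of radius rho.
    w_adv = [x + rho * gi / norm for x, gi in zip(w, g)]
    # Pass 2: gradient at the perturbed point, then descend from the original weights.
    g_adv = apply_mask(grad(w_adv), mask)
    return [x - lr * gi for x, gi in zip(w, g_adv)]

mask = [1, 0, 1, 1]  # assumed fixed sparsity pattern; the second weight is pruned
w_init = apply_mask([1.0, 1.0, 1.0, 1.0], mask)
w = list(w_init)
for _ in range(200):
    w = sam_step(w, mask)
```

Each call to `sam_step` evaluates the gradient twice, which is the extra cost that a single-step variant would aim to avoid.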