Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Liang, Tailin, Glossner, John, Wang, Lei, Shi, Shaobo
arXiv.org Artificial Intelligence
Deep neural networks have been applied in many domains and exhibit extraordinary abilities in the field of computer vision. However, complex network architectures challenge efficient real-time deployment and require significant computational resources and energy. These challenges can be overcome through optimizations such as network compression. This paper provides a survey of two types of network compression: pruning and quantization. We compare current techniques, analyze their strengths and weaknesses, provide guidance for compressing networks, and discuss possible future compression techniques.
Jan-24-2021
- Country:
  - Africa > Mali (0.04)
  - North America
    - United States
      - New York > New York County
        - New York City (0.14)
      - Massachusetts > Hampshire County
        - Amherst (0.14)
      - California > Santa Clara County
        - Palo Alto (0.04)
    - Canada > Ontario
      - Toronto (0.14)
  - Europe
    - Switzerland (0.04)
    - Italy > Calabria
      - Catanzaro Province > Catanzaro (0.04)
  - Asia
- Genre:
  - Overview (1.00)
- Industry:
  - Information Technology (1.00)
  - Education (1.00)
  - Semiconductors & Electronics (0.93)
  - Telecommunications > Networks (0.45)
- Technology: