Pruning and Quantization for Deep Neural Network Acceleration: A Survey

Liang, Tailin; Glossner, John; Wang, Lei; Shi, Shaobo

Jan-24-2021, arXiv.org Artificial Intelligence

Deep neural networks have been applied in many applications and exhibit extraordinary abilities in the field of computer vision. However, complex network architectures make efficient real-time deployment difficult and demand significant computational resources and energy. These challenges can be overcome through optimizations such as network compression. This paper surveys two types of network compression: pruning and quantization. We compare current techniques, analyze their strengths and weaknesses, provide guidance for compressing networks, and discuss possible future compression techniques.

accuracy, neural network, quantization

Country:
- Africa > Mali (0.04)
- North America
  - United States
    - New York > New York County
      - New York City (0.14)
    - Massachusetts > Hampshire County
      - Amherst (0.14)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Europe
  - Switzerland (0.04)
  - Italy > Calabria
    - Catanzaro Province > Catanzaro (0.04)
- Asia
  - Japan > Honshū
    - Tōhoku > Fukushima Prefecture > Fukushima (0.04)
  - China > Beijing
    - Beijing (0.04)

Genre:
- Overview (1.00)

Industry:
- Information Technology (1.00)
- Education (1.00)
- Semiconductors & Electronics (0.93)
- Telecommunications > Networks (0.45)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
