Model Compression via Pruning

Nov-23-2020, 19:50:57 GMT–#artificialintelligence

To obtain fast and accurate inference on edge devices, a model has to be optimized for real-time inference. Fine-tuned state-of-the-art models like VGG16/19, ResNet50 have 138 million and 23 million parameters respectively and inference is often expensive on resource-constrained devices. Previously I've talked about one model compression technique called "Knowledge Distillation" using a smaller student network to mimic the performance of a larger teacher network (Both student and teacher network has different network architecture). Today, the focus will be on "Pruning" one model compression technique that allows us to compress the model to a smaller size with zero or marginal loss of accuracy. In short, pruning eliminates the weights with low magnitude (That does not contribute much to the final model performance).

mobilenet, pruning, sparsity, (12 more...)

#artificialintelligence

Nov-23-2020, 19:50:57 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language > Machine Translation (0.31)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found