Knowledge Distillation -- Make your neural networks smaller
Deploying a huge model with millions of parameters is not easy. What if we could transfer its knowledge to a smaller model and use that model for inference? By the end of this article you will understand how this can be done. Knowledge distillation means transferring the knowledge of a larger model to a smaller one with minimal loss of information. It can also refer to transferring the knowledge of multiple models (an ensemble) into a single one. In most cases, we use the same model for both training and inference; distillation breaks this coupling, letting us train a large model but deploy a small one.
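To make the idea concrete, here is a minimal NumPy sketch (not from the article) of the classic distillation loss: the teacher's logits are softened with a temperature `T`, and the student is trained to match that softened distribution via KL divergence. The function names and the choice of `T` are illustrative assumptions.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T gives a softer distribution,
    # exposing the teacher's "dark knowledge" about non-target classes.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between the softened teacher and student outputs,
    # scaled by T^2 so gradients keep a comparable magnitude across T.
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # student predictions
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()
```

When the student's logits match the teacher's exactly, the loss is zero; any mismatch yields a positive penalty. In practice this term is usually combined with the ordinary cross-entropy loss on the true labels.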
Jan-16-2023, 23:05:06 GMT