Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher

Neural Information Processing Systems 

On the other hand, recent findings on the neural tangent kernel enable us to approximate a wide neural network with a linear model of the network's random features. In this paper, we theoretically analyze knowledge distillation of a wide neural network. First, we provide a transfer risk bound for the linearized model of the network. Then, we propose a metric of the task's training difficulty, called data inefficiency.
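
For concreteness, below is a minimal sketch of the linearization the abstract refers to: a first-order Taylor expansion of a network around its initialization, so that the model becomes linear in the parameters with the network's Jacobian features. The network `f`, its width, and the toy data are illustrative assumptions, not details from the paper.

```python
import jax
import jax.numpy as jnp

def f(params, x):
    # A small two-layer MLP; stands in for the "wide network" of the paper.
    w1, w2 = params
    return jnp.tanh(x @ w1) @ w2

def linearize(f, params0):
    # f_lin(w, x) = f(w0, x) + J_w f(w0, x) @ (w - w0),
    # i.e. a model that is linear in the parameters w.
    def f_lin(params, x):
        delta = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
        y0, jvp = jax.jvp(lambda p: f(p, x), (params0,), (delta,))
        return y0 + jvp
    return f_lin

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
width = 512  # the wider the network, the closer it stays to its linearization
params0 = (jax.random.normal(k1, (10, width)) / jnp.sqrt(10),
           jax.random.normal(k2, (width, 1)) / jnp.sqrt(width))
x = jax.random.normal(k3, (4, 10))

f_lin = linearize(f, params0)
# At initialization the network and its linearized model agree exactly.
print(jnp.allclose(f(params0, x), f_lin(params0, x)))
```

In the NTK regime, training a sufficiently wide network stays close to training `f_lin`, which is why the paper's risk bound can be stated for the linearized model.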