Reviews: Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

Oct-7-2024, 23:21:55 GMT–Neural Information Processing Systems

The paper proposes a new method for using unlabeled data in semi-supervised learning. The idea is to construct a teacher network from the student network during training by taking an exponentially decaying moving average of the student's weights, updated after each batch. This is inspired by previous work that uses a temporal ensemble of the softmax outputs, and it aims to reduce the variance of the targets during training. Noise of various forms is added to both labeled and unlabeled examples, and an L2 penalty is added to encourage the student's outputs to be consistent with the teacher's. As the authors mention, this acts as a kind of soft, adaptive label propagation mechanism. The advantage of their approach over temporal ensembling is that it can be used in the online setting.
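The two ingredients summarized above, the weight-averaged teacher and the L2 consistency penalty, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the dictionary-of-arrays weight format, the decay value of 0.99, and the choice to apply the penalty to softmax outputs are all assumptions made for illustration.

```python
import numpy as np


def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight.

    `teacher` and `student` are hypothetical dicts mapping a parameter
    name to a NumPy array. The decay value is an assumed default.
    """
    for name, w_student in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * w_student
    return teacher


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def consistency_loss(student_logits, teacher_logits):
    """Mean squared difference between student and teacher predictions.

    Assumption: the L2 penalty is taken between softmax outputs.
    """
    diff = softmax(student_logits) - softmax(teacher_logits)
    return float(np.mean(diff ** 2))


# Toy example: one averaging step after a (pretend) student batch update.
student = {"w": np.array([1.0, 2.0])}
teacher = {"w": np.array([0.0, 0.0])}
ema_update(teacher, student, decay=0.9)

# Identical predictions incur no penalty; differing ones do.
same = consistency_loss(np.array([[1.0, 0.0]]), np.array([[1.0, 0.0]]))
different = consistency_loss(np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]]))
```

In a training loop, `ema_update` would be called once after every student gradient step, and `consistency_loss` would be added to the supervised loss, with the student and teacher each seeing a separately noised version of the same input.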


