(Self-)Supervised Pre-training? Self-training? Which one to use?

#artificialintelligence 

Recently, pre-training has been a hot topic in Computer Vision (and also in NLP), especially since one of the breakthroughs in NLP -- BERT -- showed how to train an NLP model using a "self-supervised" signal. In short, we design an algorithm that can generate a "pseudo-label" by itself (meaning a label that is true for a specific, automatically constructed task rather than a human annotation), and then treat learning as an ordinary supervised task using that generated pseudo-label. Such a task is commonly called a "Pretext Task". For example, BERT uses masked word prediction as its pretext task: a word in the sentence is masked at random, and the model is asked to predict what that word is given the rest of the sentence. Once trained this way, the model is called a pre-trained model, and it is then fine-tuned on the task we actually want to solve (usually called the "Downstream Task").
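
To make the masked-word-prediction pretext task concrete, here is a minimal sketch of one pre-training step in PyTorch. The vocabulary size, mask token id, model architecture, and hyperparameters are all illustrative assumptions, not BERT's actual configuration; the point is only to show how the pseudo-label is generated from the data itself and then used as a supervised target.

```python
# Minimal sketch of the masked-word-prediction pretext task (illustrative, not BERT's setup).
import torch
import torch.nn as nn

VOCAB_SIZE = 1000   # toy vocabulary size (assumption)
MASK_ID = 0         # reserved id standing in for the [MASK] token (assumption)
MASK_PROB = 0.15    # fraction of words to mask, as in BERT

class TinyMaskedLM(nn.Module):
    def __init__(self, vocab_size=VOCAB_SIZE, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)  # predicts the original word id

    def forward(self, token_ids):
        return self.head(self.encoder(self.embed(token_ids)))

def mask_tokens(token_ids):
    """Generate the pseudo-label: hide some words, keep the originals as targets."""
    labels = token_ids.clone()
    mask = torch.rand(token_ids.shape) < MASK_PROB
    corrupted = token_ids.clone()
    corrupted[mask] = MASK_ID
    labels[~mask] = -100          # ignore unmasked positions in the loss
    return corrupted, labels

model = TinyMaskedLM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)

# One pre-training step on a random "sentence" batch (stand-in for real tokenized text).
batch = torch.randint(1, VOCAB_SIZE, (8, 32))
inputs, labels = mask_tokens(batch)
logits = model(inputs)
loss = loss_fn(logits.view(-1, VOCAB_SIZE), labels.view(-1))
loss.backward()
optimizer.step()
```

After pre-training like this, the encoder's weights would be reused and fine-tuned on the downstream task, typically by replacing the word-prediction head with a task-specific head.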
