Can This Tiny Language Model Defeat Gigantic GPT-3?
While GPT-3 has been touted for achieving state-of-the-art performance on complex NLP tasks with more than a hundred billion parameters, researchers from LMU Munich, Germany have proposed a language model that can show similar achievements with far fewer parameters. GPT-3 has 175 billion parameters and shows remarkable few-shot abilities; by reformulating a few tasks as prompts, it also showed strong capabilities on the SuperGLUE benchmark. However, this approach comes with two significant drawbacks: such large models aren't always feasible for real-world scenarios, and because the context window of these monstrous models is limited to a few hundred tokens, priming doesn't scale to more than a few examples. The researchers therefore proposed an alternative to priming, Pattern-Exploiting Training (PET), which requires unlabelled data. Because unlabelled data is easier to gather than labelled data, PET is usable for real-world applications.
Sep-26-2020, 20:40:41 GMT