Tokyo U & Google Brain Train Large Language Models as Zero-Shot Reasoners

#artificialintelligence 

Pretrained large language models (LLMs) now scale beyond 100B parameters and have revolutionized the field of natural language processing (NLP) with their strong few-shot and zero-shot learning capabilities. However, although state-of-the-art LLMs make short work of system-1 tasks, they still struggle on system-2 tasks that require slow, multi-step reasoning.

A research team from the University of Tokyo and Google Brain addresses this deficiency in their new paper Large Language Models are Zero-Shot Reasoners, which demonstrates that LLMs can become decent zero-shot reasoners through the addition of a simple prompt -- "Let's think step by step" -- that elicits a step-by-step thinking process before each question is answered. The resulting Zero-shot-CoT (zero-shot chain-of-thought prompting) method achieves substantial performance gains over the zero-shot baseline.

The division of human thinking into fast/automatic (system-1) and slow/rational (system-2) processes was proposed in the 2011 bestseller Thinking, Fast and Slow by psychologist Daniel Kahneman and has been widely adopted by machine learning researchers seeking to endow their models with more advanced, humanlike reasoning capabilities.
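The paper's approach uses two prompting stages: the trigger phrase first elicits a reasoning chain, and a second prompt feeds that chain back to extract the final answer. A minimal sketch, assuming a generic `generate(prompt)` text-completion function as a hypothetical stand-in for any LLM API:

```python
# Sketch of Zero-shot-CoT's two-stage prompting. `generate` is a
# hypothetical stand-in for any LLM text-completion call, not a real API.

REASONING_TRIGGER = "Let's think step by step."
ANSWER_TRIGGER = "Therefore, the answer is"

def zero_shot_cot(question, generate):
    # Stage 1: append the trigger phrase so the model produces a
    # chain of reasoning before committing to an answer.
    reasoning_prompt = f"Q: {question}\nA: {REASONING_TRIGGER}"
    reasoning = generate(reasoning_prompt)
    # Stage 2: feed the reasoning back and prompt for the final answer.
    answer_prompt = f"{reasoning_prompt}{reasoning}\n{ANSWER_TRIGGER}"
    return generate(answer_prompt)
```

The same question without the trigger would be sent as a plain `"Q: …\nA:"` zero-shot prompt; the only change Zero-shot-CoT makes is the added phrase and the second extraction pass.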
