Large Language Models Can Self-Improve

Huang, Jiaxin, Gu, Shixiang Shane, Hou, Le, Wu, Yuexin, Wang, Xuezhi, Yu, Hongkun, Han, Jiawei

Oct-25-2022–arXiv.org Artificial Intelligence

Large Language Models (LLMs) have achieved excellent performance on a variety of tasks. However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand, may improve their reasoning abilities by self-thinking without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate "high-confidence" rationale-augmented answers for unlabeled questions using Chain-of-Thought prompting and self-consistency, and fine-tune the LLM using those self-generated solutions as target outputs. We show that our approach improves the general reasoning ability of a 540B-parameter LLM (74.4% → 82.1% on GSM8K, 78.2% → 83.0% on DROP, 90.0% → 94.4% on OpenBookQA, and 63.4% → 67.9% on ANLI-A3) and achieves state-of-the-art-level performance, without any ground truth label. We conduct ablation studies and show that fine-tuning on reasoning is critical for self-improvement. Scaling has enabled Large Language ...
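
The abstract's selection step (sample several Chain-of-Thought reasoning paths per unlabeled question, take the majority answer via self-consistency, and keep only "high-confidence" cases as fine-tuning targets) might be sketched roughly as below. This is an illustration only, not the authors' implementation: the function name `select_high_confidence`, the `min_agreement` threshold of 0.6, and the `(rationale, answer)` data layout are all assumptions made for the example, and the sampled paths are hard-coded rather than produced by a model.

```python
from collections import Counter


def select_high_confidence(samples, min_agreement=0.6):
    """Pick fine-tuning targets for one question by majority vote.

    samples: list of (rationale, answer) pairs, assumed to be sampled
             reasoning paths for a single unlabeled question.
    min_agreement: hypothetical confidence threshold (not from the paper).

    Returns (majority_answer, confidence, kept_rationales). The kept list
    is empty when the agreement rate falls below the threshold.
    """
    if not samples:
        return None, 0.0, []

    # Self-consistency: count how often each final answer appears.
    counts = Counter(answer for _, answer in samples)
    majority_answer, votes = counts.most_common(1)[0]

    # Use the agreement rate as a simple confidence estimate.
    confidence = votes / len(samples)
    if confidence < min_agreement:
        return majority_answer, confidence, []

    # Keep only the rationales that reach the majority answer.
    kept = [r for r, a in samples if a == majority_answer]
    return majority_answer, confidence, kept


# Toy reasoning paths for the question "What is (3 + 4) * 2?"
samples = [
    ("3 + 4 = 7, then 7 * 2 = 14.", "14"),
    ("Double 3 is 6, double 4 is 8, and 6 + 8 = 14.", "14"),
    ("3 + 4 * 2 = 11.", "11"),
    ("(3 + 4) * 2 = 14.", "14"),
]
answer, confidence, kept = select_high_confidence(samples)
print(answer, confidence, len(kept))
```

In a full pipeline, the kept rationale and answer pairs would become the target outputs for fine-tuning; no ground-truth label enters the selection at any point.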

artificial intelligence, large language model, natural language, (16 more...)

Country:
- North America
  - Dominican Republic (0.04)
  - Central America (0.04)
  - United States
    - Pennsylvania (0.04)
    - Illinois > Cook County
      - Chicago (0.04)
    - California > San Francisco County
      - San Francisco (0.04)
- Europe > United Kingdom
  - England > Nottinghamshire (0.04)
- Asia
  - Vietnam (0.04)
  - Middle East > Iraq (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Government (1.00)
- Leisure & Entertainment > Sports
  - Football (1.00)
- Health & Medicine > Therapeutic Area
  - Oncology (0.68)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
