A Tutorial on LLM Reasoning: Relevant Methods behind ChatGPT o1

Feb-15-2025 – arXiv.org Artificial Intelligence

OpenAI o1 has shown that applying reinforcement learning to integrate reasoning steps directly during inference can significantly improve a model's reasoning capabilities. This result is exciting as the field transitions from the conventional autoregressive method of generating answers to a more deliberate approach that models the slow-thinking process through step-by-step reasoning training. Reinforcement learning plays a key role in both the model's training and decoding processes. In this article, we present a comprehensive formulation of reasoning problems and investigate the use of both model-based and model-free approaches to better support this slow-thinking framework.

large language model, machine learning, reinforcement learning

Genre:
- Instructional Material > Course Syllabus & Notes (1.00)

Industry:
- Education (0.50)
- Leisure & Entertainment > Games (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Cognitive Science > Problem Solving (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (1.00)
