BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
Yang, Junxiao, Tu, Jinzhe, Liu, Haoran, Wang, Xiaoce, Zheng, Chujie, Zhang, Zhexin, Cui, Shiyao, Chen, Caishun, He, Tiantian, Wang, Hongning, Ong, Yew-Soon, Huang, Minlie
arXiv.org Artificial Intelligence
Recent advances in Large Reasoning Models (LRMs) have shown impressive capabilities in mathematical and logical reasoning. However, current LRMs rarely admit ignorance or respond with "I don't know". Instead, they often produce incorrect answers while showing undue confidence, raising concerns about their factual reliability. In this work, we identify two pathological reasoning patterns characterized by overthinking that contribute to overconfident and incorrect answers: last-minute guessing and second-thought spiraling. To address these issues, we propose BARREL, a novel framework that promotes concise and boundary-aware factual reasoning. Our experiments show that BARREL training increases the reliability of DeepSeek-R1-Distill-Llama-8B from 39.33% to 61.48%, while still achieving accuracy comparable to models finetuned on reasoning data generated by R1. These results demonstrate that our pilot study offers a promising direction for building more reliable and factual System 2 LRMs.
May-21-2025