Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs

Wang, Siyuan, Wei, Zhongyu, Choi, Yejin, Ren, Xiang

Jun-20-2024–arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. To investigate this, we propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic, comprising both primitive and compositional rules across five domains. Our analysis of GPT-series models over a rule subset reveals significant gaps in LLMs' logic understanding compared to human performance, especially on compositional and structurally complex rules, where the models show certain bias patterns. We further distill these rules into a smaller-scale inference engine for flexible rule generation and for enhancing downstream reasoning. In a multi-judger evaluation, our inference engine proves effective at generating accurate, complex and abstract conclusions and premises, and it improves performance on various commonsense reasoning tasks. Overall, our work sheds light on LLMs' limitations in grasping inferential rules and suggests ways to enhance their logical reasoning abilities. Code and data are available at https://github.com/SiyuanWangw/ULogic.



