 inferential rule


Beyond Instruction Following: Evaluating Rule Following of Large Language Models

arXiv.org Artificial Intelligence

Although Large Language Models (LLMs) have demonstrated strong instruction-following ability, which makes them helpful, real-world scenarios also require that they be controlled and guided by rules so that their responses are safe and accurate. This demands that LLMs possess rule-following capability. However, few works have clearly evaluated this capability, and previous studies that attempt to do so fail to distinguish rule-following scenarios from instruction-following scenarios. Therefore, this paper first clarifies the concept of rule-following and then curates a comprehensive benchmark, RuleBench, to evaluate a diverse range of rule-following abilities. Our experimental results on a variety of LLMs show that they are still limited in following rules. Our further analysis provides insights into how LLMs can be improved toward becoming better rule-following intelligent agents. The data and code can be found at: https://anonymous.4open.science/r/llm-rule-following-B3E3/


Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs

arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. To investigate this, we propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic, comprising both primitive and compositional rules across five domains. Our analysis of GPT-series models over a rule subset reveals significant gaps in LLMs' logic understanding compared to human performance, especially for compositional and structurally complex rules, along with certain bias patterns. We further distill these rules into a smaller-scale inference engine for flexible rule generation and for enhancing downstream reasoning. In a multi-judger evaluation, our inference engine proves effective in generating accurate, complex, and abstract conclusions and premises, and it improves various commonsense reasoning tasks. Overall, our work sheds light on LLMs' limitations in grasping inferential rules and suggests ways to enhance their logical reasoning abilities. Code and data are available at https://github.com/SiyuanWangw/ULogic.


The Age of Intelligent Machines: The Social Impact of Artificial Intelligence

AITopics Original Links

Is artificial intelligence in human society a utopian dream or a Faustian nightmare? Will our descendants honor us for making machines do things that human minds do or berate us for irresponsibility and hubris? Either of these judgments might be made of us, for like most human projects this infant technology is ambivalent. Just which aspects of its potential are realized will depend largely on social and political factors. Although these are not wholly subject to deliberate control, they can be influenced by human choice and public opinion.