CLOMO: Counterfactual Logical Modification with Large Language Models

Yinya Huang, Ruixin Hong, Hongming Zhang, Wei Shao, Zhicheng Yang, Dong Yu, Changshui Zhang, Xiaodan Liang, Linqi Song

arXiv.org Artificial Intelligence 

In our study, we delve into evaluating large language models' (LLMs) ability to generate counterfactually coherent thoughts. Specifically, we propose an innovative evaluation system that quantitatively measures the evolution of information in statement pairs, ensuring that they adhere to a specified logical relationship. Our approach includes designing a specialized task where models are presented with mismatched argument-premise pairs bound by a specific logical relation. The objective ...

Although large language models (Arkoudas, 2023; OpenAI, 2022) perform strikingly well on many reasoning benchmarks (Cobbe et al., 2021; Hendrycks et al., 2021a), recent studies observe an internal inconsistency in their reasoning processes (Saparov and He, 2023; Arkoudas, 2023). This inconsistency is attributed to the misunderstanding and misapplication of logical relations. However, logical relations in complex language reasoning are not yet properly quantified and evaluated.
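As a rough illustration of the task and evaluation setup described above, the sketch below pairs each argument with a premise and a target logical relation, and scores a model's modified argument by asking a judge whether the relation now holds. This is a minimal sketch under stated assumptions, not the paper's actual implementation: the names (ClomoInstance, relation_holds, the field names) are hypothetical, and the paper's real data format and evaluation metric may differ.

    from dataclasses import dataclass
    from typing import Callable, List

    # Hypothetical data structure for one task instance: an argumentative text,
    # the (mismatched) premise it is paired with, and the target logical relation.
    @dataclass
    class ClomoInstance:
        argument: str   # original argumentative text the model must modify
        premise: str    # premise the modified argument should relate to
        relation: str   # target logical relation, e.g. "strengthen"

    def evaluate_modifications(
        instances: List[ClomoInstance],
        modified_arguments: List[str],
        relation_holds: Callable[[str, str, str], bool],
    ) -> float:
        """Fraction of modified arguments that restore the target relation.

        relation_holds(argument, premise, relation) is a placeholder judge,
        for example an evaluator LLM prompted with a yes/no question.
        """
        assert len(instances) == len(modified_arguments)
        if not instances:
            return 0.0
        hits = sum(
            relation_holds(new_arg, inst.premise, inst.relation)
            for inst, new_arg in zip(instances, modified_arguments)
        )
        return hits / len(instances)

The judge is left as a parameter so the sketch stays model-agnostic; in practice it would be instantiated with whatever evaluator (human or LLM) the benchmark uses to decide whether the specified logical relation holds between the modified argument and its premise.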