wild turkey
LeanReasoner: Boosting Complex Logical Reasoning with Lean
Jiang, Dongwei, Fonseca, Marcio, Cohen, Shay B.
Large language models (LLMs) often struggle with complex logical reasoning due to logical inconsistencies and the inherent difficulty of such reasoning. We use Lean, a theorem proving framework, to address these challenges. By formalizing logical reasoning problems into theorems within Lean, we can solve them by proving or disproving the corresponding theorems. This method reduces the risk of logical inconsistencies with the help of Lean's symbolic solver. It also enhances our ability to treat complex reasoning tasks by using Lean's extensive library of theorem proofs. Our method achieves state-of-the-art performance on the FOLIO dataset and achieves performance near this level on ProofWriter. Notably, these results were accomplished by fine-tuning on fewer than 100 in-domain samples for each dataset.
- Asia > Middle East > Republic of Türkiye (0.08)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
LogiCoT: Logical Chain-of-Thought Instruction-Tuning
Liu, Hanmeng, Teng, Zhiyang, Cui, Leyang, Zhang, Chaoli, Zhou, Qiji, Zhang, Yue
Generative Pre-trained Transformer 4 (GPT-4) demonstrates impressive chain-of-thought reasoning ability. Recent work on self-instruction tuning, such as Alpaca, has focused on enhancing the general proficiency of models. These instructions enable the model to achieve performance comparable to GPT-3.5 on general tasks like open-domain text generation and paraphrasing. However, they fall short of helping the model handle complex reasoning tasks. To bridge the gap, this paper presents LogiCoT, a new instruction-tuning dataset for Logical Chain-of-Thought reasoning with GPT-4. We elaborate on the process of harvesting instructions for prompting GPT-4 to generate chain-of-thought rationales. LogiCoT serves as an instruction set for teaching models of logical reasoning and elicits general reasoning skills.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
FOLIO: Natural Language Reasoning with First-Order Logic
Han, Simeng, Schoelkopf, Hailey, Zhao, Yilun, Qi, Zhenting, Riddell, Martin, Benson, Luke, Sun, Lucy, Zubova, Ekaterina, Qiao, Yujie, Burtell, Matthew, Peng, David, Fan, Jonathan, Liu, Yixin, Wong, Brian, Sailor, Malcolm, Ni, Ansong, Nan, Linyong, Kasai, Jungo, Yu, Tao, Zhang, Rui, Joty, Shafiq, Fabbri, Alexander R., Kryscinski, Wojciech, Lin, Xi Victoria, Xiong, Caiming, Radev, Dragomir
We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first order logic (FOL) annotations. FOLIO consists of 1,435 examples (unique conclusions), each paired with one of 487 sets of premises which serve as rules to be used to deductively reason for the validity of each conclusion. The logical correctness of premises and conclusions is ensured by their parallel FOL annotations, which are automatically verified by our FOL inference engine. In addition to the main NL reasoning task, NL-FOL pairs in FOLIO automatically constitute a new NL-FOL translation dataset using FOL as the logical form. Our experiments on FOLIO systematically evaluate the FOL reasoning ability of supervised fine-tuning on medium-sized language models (BERT, RoBERTa) and few-shot prompting on large language models (GPT-NeoX, OPT, GPT-3, Codex). For NL-FOL translation, we experiment with GPT-3 and Codex. Our results show that one of the most capable Large Language Model (LLM) publicly available, GPT-3 davinci, achieves only slightly better than random results with few-shot prompting on a subset of FOLIO, and the model is especially bad at predicting the correct truth values for False and Unknown conclusions. Our dataset and code are available at https://github.com/Yale-LILY/FOLIO.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Dominican Republic (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (10 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)