peter clark
NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning
Weir, Nathaniel, Clark, Peter, Van Durme, Benjamin
Our goal is a modern approach to answering questions via systematic reasoning where answers are supported by human interpretable proof trees grounded in an NL corpus of authoritative facts. Such a system would help alleviate the challenges of interpretability and hallucination with modern LMs, and the lack of grounding of current explanation methods (e.g., Chain-of-Thought). This paper proposes a new take on Prolog-based inference engines, where we replace handcrafted rules with a combination of neural language modeling, guided generation, and semiparametric dense retrieval. Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA as entailment tree proof search, going beyond earlier work explaining known-to-be-true facts from text. In experiments, NELLIE outperforms a similar-sized state-of-the-art reasoner [Tafjord et al., 2022] while producing knowledge-grounded explanations. We also find NELLIE can exploit both semi-structured and NL text corpora to guide reasoning. Together these suggest a new way to jointly reap the benefits of both modern neural methods and traditional symbolic reasoning.
FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain
Labrak, Yanis, Bazoge, Adrien, Dufour, Richard, Rouvier, Mickael, Morin, Emmanuel, Daille, Bรฉatrice, Gourraud, Pierre-Antoine
This paper introduces FrenchMedMCQA, the first publicly available Multiple-Choice Question Answering (MCQA) dataset in French for medical domain. It is composed of 3,105 questions taken from real exams of the French medical specialization diploma in pharmacy, mixing single and multiple answers. Each instance of the dataset contains an identifier, a question, five possible answers and their manual correction(s). We also propose first baseline models to automatically process this MCQA task in order to report on the current performances and to highlight the difficulty of the task. A detailed analysis of the results showed that it is necessary to have representations adapted to the medical domain or to the MCQA task: in our case, English specialized models yielded better results than generic French ones, even though FrenchMedMCQA is in French. Corpus, models and tools are available online.
GitHub - jeffhj/LM-reasoning: This repository contains a collection of papers and resources on Reasoning in Large Language Models.
This repository contains a collection of papers and resources on Reasoning in Large Language Models. Feel free to let me know the missing papers (issue or pull request). Thank Kevin Chen-Chuan Chang @UIUC, Jason Wei @Google Brain, Denny Zhou @Google Brain for insightful discussions and suggestions. We mainly focus on techniques that are applicable to improving or eliciting "reasoning" in large language models like GPT-3 (175B) Papers in this paradigm vary a lot and are usually based on small models trained on specific datasets. We list several papers here for reference (that is, the list is not complete).
Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement
Mishra, Bhavana Dalvi, Tafjord, Oyvind, Clark, Peter
Our goal is a teachable reasoning system for question-answering (QA), where a user can interact with faithful answer explanations, and correct its errors so that the system improves over time. Our approach is to augment a QA model with a dynamic memory of user feedback, containing user-supplied corrections to erroneous model beliefs that users identify during interaction. Retrievals from memory are used as additional context for QA, to help avoid previous mistakes in similar new situations - a novel application of memory-based continuous learning. With simulated feedback, we find that our system (called TeachMe) continually improves with time, and without model retraining, requiring feedback on only 25% of training examples to reach within 1% of the upper-bound (feedback on all examples). Similarly, in experiments with real users, we observe a similar trend, with performance improving by over 15% on a hidden test set after teaching. This suggests new opportunities for using frozen language models in an interactive setting where users can inspect, debug, and correct the model's beliefs, leading to improved system's performance over time.
EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering
Hardalov, Momchil, Mihaylov, Todor, Zlatkova, Dimitrina, Dinkov, Yoan, Koychev, Ivan, Nakov, Preslav
We propose EXAMS -- a new benchmark dataset for cross-lingual and multilingual question answering for high school examinations. We collected more than 24,000 high-quality high school exam questions in 16 languages, covering 8 language families and 24 school subjects from Natural Sciences and Social Sciences, among others. EXAMS offers a fine-grained evaluation framework across multiple languages and subjects, which allows precise analysis and comparison of various models. We perform various experiments with existing top-performing multilingual pre-trained models and we show that EXAMS offers multiple challenges that require multilingual knowledge and reasoning in multiple domains. We hope that EXAMS will enable researchers to explore challenging reasoning and knowledge transfer methods and pre-trained models for school question answering in various languages which was not possible before. The data, code, pre-trained models, and evaluation are available at https://github.com/mhardalov/exams-qa.
ExplanationLP: Abductive Reasoning for Explainable Science Question Answering
Thayaparan, Mokanarangan, Valentino, Marco, Freitas, Andrรฉ
We propose a novel approach for answering and explaining multiple-choice science questions by reasoning on grounding and abstract inference chains. This paper frames question answering as an abductive reasoning problem, constructing plausible explanations for each choice and then selecting the candidate with the best explanation as the final answer. Our system, ExplanationLP, elicits explanations by constructing a weighted graph of relevant facts for each candidate answer and extracting the facts that satisfy certain structural and semantic constraints. To extract the explanations, we employ a linear programming formalism designed to select the optimal subgraph. The graphs' weighting function is composed of a set of parameters, which we fine-tune to optimize answer selection performance. We carry out our experiments on the WorldTree and ARC-Challenge corpus to empirically demonstrate the following conclusions: (1) Grounding-Abstract inference chains provides the semantic control to perform explainable abductive reasoning (2) Efficiency and robustness in learning with a fewer number of parameters by outperforming contemporary explainable and transformer-based approaches in a similar setting (3) Generalisability by outperforming SOTA explainable approaches on general science question sets.
Knowledge Patterns
Clark, Peter, Thompson, John, Porter, Bruce
This Chapter describes a new technique, called "knowledge patterns", for helping construct axiom-rich, formal ontologies, based on identifying and explicitly representing recurring patterns of knowledge (theory schemata) in the ontology, and then stating how those patterns map onto domain-specific concepts in the ontology. From a modeling perspective, knowledge patterns provide an important insight into the structure of a formal ontology: rather than viewing a formal ontology simply as a list of terms and axioms, knowledge patterns views it as a collection of abstract, modular theories (the "knowledge patterns") plus a collection of modeling decisions stating how different aspects of the world can be modeled using those theories. Knowledge patterns make both those abstract theories and their mappings to the domain of interest explicit, thus making modeling decisions clear, and avoiding some of the ontological confusion that can otherwise arise. In addition, from a computational perspective, knowledge patterns provide a simple and computationally efficient mechanism for facilitating knowledge reuse. We describe the technique and an application built using them, and then critique its strengths and weaknesses. We conclude that this technique enables us to better explicate both the structure and modeling decisions made when constructing a formal axiom-rich ontology.
GenericsKB: A Knowledge Base of Generic Statements
Bhakthavatsalam, Sumithra, Anastasiades, Chloe, Clark, Peter
We present a new resource for the NLP community, namely a large (3.5M+ sentence) knowledge base of *generic statements*, e.g., "Trees remove carbon dioxide from the atmosphere", collected from multiple corpora. This is the first large resource to contain *naturally occurring* generic sentences, as opposed to extracted or crowdsourced triples, and thus is rich in high-quality, general, semantically complete statements. All GenericsKB sentences are annotated with their topical term, surrounding context (sentences), and a (learned) confidence. We also release GenericsKB-Best (1M+ sentences), containing the best-quality generics in GenericsKB augmented with selected, synthesized generics from WordNet and ConceptNet. In tests on two existing datasets requiring multihop reasoning (OBQA and QASC), we find using GenericsKB can result in higher scores and better explanations than using a much larger corpus. This demonstrates that GenericsKB can be a useful resource for NLP applications, as well as providing data for linguistic studies of generics and their semantics. GenericsKB is available at https://allenai.org/data/genericskb.
Natural Language QA Approaches using Reasoning with External Knowledge
Baral, Chitta, Banerjee, Pratyay, Pal, Kuntal Kumar, Mitra, Arindam
Question answering (QA) in natural language (NL) has been an important aspect of AI from its early days. Winograd's ``councilmen'' example in his 1972 paper and McCarthy's Mr. Hug example of 1976 highlights the role of external knowledge in NL understanding. While Machine Learning has been the go-to approach in NL processing as well as NL question answering (NLQA) for the last 30 years, recently there has been an increasingly emphasized thread on NLQA where external knowledge plays an important role. The challenges inspired by Winograd's councilmen example, and recent developments such as the Rebooting AI book, various NLQA datasets, research on knowledge acquisition in the NLQA context, and their use in various NLQA models have brought the issue of NLQA using ``reasoning'' with external knowledge to the forefront. In this paper, we present a survey of the recent work on them. We believe our survey will help establish a bridge between multiple fields of AI, especially between (a) the traditional fields of knowledge representation and reasoning and (b) the field of NL understanding and NLQA.