Problem Solving


What A Situated Language-Using Agent Must be Able to Do: A Top-Down Analysis

arXiv.org Artificial Intelligence

Even in our increasingly text-intensive times, the primary site of language use is situated, co-present interaction. It is primary ontogenetically and phylogenetically, and it is arguably also still primary in negotiating everyday social situations. Situated interaction is also the final frontier of Natural Language Processing, where, compared to the area of text processing, very little progress has been made in the past decade, and where a myriad of practical applications is waiting to be unlocked. While the usual approach in the field is to reach, bottom-up, for the ever next "adjacent possible", in this paper I attempt a top-down analysis of what the demands are that unrestricted situated interaction makes on the participating agent, and suggest ways in which this analysis can structure computational models and research on them. Specifically, I discuss representational demands (the building up and application of world model, language model, situation model, discourse model, and agent model) and what I call anchoring processes (incremental processing, incremental learning, conversational grounding, multimodal grounding) that bind the agent to the here, now, and us.


Causal Reasoning of Entities and Events in Procedural Texts

arXiv.org Artificial Intelligence

Entities and events are crucial to natural language reasoning and common in procedural texts. Existing work has focused either exclusively on entity state tracking (e.g., whether a pan is hot) or on event reasoning (e.g., whether one would burn themselves by touching the pan), while these two tasks are often causally related. We propose CREPE, the first benchmark on causal reasoning of event plausibility and entity states. We show that most language models, including GPT-3, perform close to chance at .35 F1, lagging far behind human performance at .87 F1. We boost model performance to .59 F1 by creatively representing events as programming languages while prompting language models pretrained on code. By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1. Our findings indicate not only the challenge that CREPE brings for language models, but also the efficacy of code-like prompting combined with chain-of-thought prompting for multihop event reasoning.


Empirical Investigation of Neural Symbolic Reasoning Strategies

arXiv.org Artificial Intelligence

Neural reasoning accuracy improves when generating intermediate reasoning steps. However, the source of this improvement remains unclear. Here, we investigate and factorize the benefit of generating intermediate steps for symbolic reasoning. Specifically, we decompose the reasoning strategy w.r.t. step granularity and chaining strategy. With a purely symbolic numerical reasoning dataset (e.g., A=1, B=3, C=A+3, C?), we found that the choice of reasoning strategy significantly affects performance, with the gap becoming even larger as the extrapolation length increases. Surprisingly, we also found that certain configurations lead to nearly perfect performance, even in the case of length extrapolation. Our results indicate the importance of further exploring effective strategies for neural reasoning models.
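To make the dataset format concrete, here is a minimal Python sketch that solves a problem of the form quoted in the abstract (A=1, B=3, C=A+3, C?) while writing out each intermediate step. The function name `solve_with_steps`, the step format, and the restriction to addition are assumptions made for illustration; they are not the paper's actual step granularity or chaining configurations.

```python
def solve_with_steps(problem: str):
    """Evaluate a chain such as 'A=1, B=3, C=A+3, C?' and record each step.

    Hypothetical helper: supports only integer literals, variables, and '+'.
    """
    *assignments, query = [part.strip() for part in problem.split(",")]
    values = {}
    steps = []
    for assignment in assignments:
        name, expr = assignment.split("=")
        # Replace already-known variables by their values, then add the terms.
        terms = [values[t] if t in values else int(t) for t in expr.split("+")]
        values[name] = sum(terms)
        steps.append(f"{name}={'+'.join(str(t) for t in terms)}={values[name]}")
    target = query.rstrip("?")
    return values[target], steps


answer, steps = solve_with_steps("A=1, B=3, C=A+3, C?")
print(steps)   # one explicit line per intermediate step
print(answer)
```

A model that emits only the final answer corresponds to returning `answer` alone; emitting `steps` first is one possible form of intermediate reasoning.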


A survey on knowledge-enhanced multimodal learning

AIHub

Multimodal learning is a field of increasing interest in the research community, as it is more closely aligned to the way a human perceives the world: a combination of visual information, language, sounds, and other senses provides complementary insights regarding the world state. Significant advancements in unimodal learning, such as the advent of transformers, boosted the capabilities of multimodal approaches, not only in terms of task-specific performance but also regarding the ability to develop multi-task models. Nevertheless, even such powerful multimodal approaches present shortcomings when it comes to reasoning beyond before-seen knowledge, even if that knowledge refers to simple everyday situations such as "in very cold temperatures the water freezes". This is where external knowledge sources can contribute to enhance model performance by providing such pieces of missing information. The term "knowledge-enhanced" refers to any model utilizing external (or even internal) knowledge sources to extend their predictive capabilities beyond the knowledge that can be extracted from datasets learned during the training phase.


Augmented Language Models: a Survey

arXiv.org Artificial Intelligence

This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in calling external modules such as a code interpreter. LMs can leverage these augmentations separately or in combination via heuristics, or learn to do so from demonstrations. While adhering to a standard missing-token prediction objective, such augmented LMs can use various, possibly non-parametric external modules to expand their context processing ability, thus departing from the pure language modeling paradigm. We therefore refer to them as Augmented Language Models (ALMs). The missing-token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks and even outperforming most regular LMs on several benchmarks. In this work, after reviewing current advances in ALMs, we conclude that this new research direction has the potential to address common limitations of traditional LMs such as interpretability, consistency, and scalability issues.


Neurosymbolic AI for Reasoning on Graph Structures: A Survey

arXiv.org Artificial Intelligence

Neurosymbolic AI is an increasingly active area of research which aims to combine symbolic reasoning methods with deep learning to generate models with both high predictive performance and some degree of human-level comprehensibility. As knowledge graphs are becoming a popular way to represent heterogeneous and multi-relational data, methods for reasoning on graph structures have attempted to follow this neurosymbolic paradigm. Traditionally, such approaches have utilized either rule-based inference or generated representative numerical embeddings from which patterns could be extracted. However, several recent studies have attempted to bridge this dichotomy in ways that facilitate interpretability, maintain performance, and integrate expert knowledge. Within this article, we survey a breadth of methods that perform neurosymbolic reasoning tasks on graph structures. To better compare the various methods, we propose a novel taxonomy by which we can classify them. Specifically, we propose three major categories: (1) logically-informed embedding approaches, (2) embedding approaches with logical constraints, and (3) rule-learning approaches. Alongside the taxonomy, we provide a tabular overview of the approaches and links to their source code, if available, for more direct comparison. Finally, we discuss the applications on which these methods were primarily used and propose several prospective directions toward which this new field of research could evolve.


STREET: A Multi-Task Structured Reasoning and Explanation Benchmark

arXiv.org Artificial Intelligence

Unlike most existing question-answering (QA) datasets, we expect models to not only answer questions, but also produce step-by-step structured explanations describing how premises in the question are used to produce intermediate conclusions that can prove the correctness of a certain answer. We perform extensive evaluation with popular language models such as few-shot-prompted GPT-3 and fine-tuned T5. We find that these models still lag behind human performance when producing such structured reasoning steps. We believe this work will provide a way for the community to better train and test systems on multi-step reasoning and explanations in natural language. A long-term pursuit in Artificial Intelligence is to endow machines with the ability to reason and manipulate premises to reach conclusions and perform tasks. Some recent works in the field of question-answering (QA) have demonstrated that language models can bypass some of these issues and learn to reason directly over natural language (Clark et al., 2020), allowing for more flexible and adaptable reasoning capabilities. Another advantage of performing multi-step reasoning over natural language is that it allows for more inspectable outputs, improving the explainability of models that are otherwise regarded as black box systems (Jain & Wallace, 2019; Rajani et al., 2019a; Danilevsky et al., 2020). Despite the recent progress, we notice that there is still a gap in resources for training and evaluating general reasoning capabilities over natural language. We build upon existing QA datasets by adding multi-premise, multi-step, structured explanations in the form of reasoning graphs, as depicted in Figure 1. When combined, all reasoning graphs contain a total of 151.1k reasoning steps (or textual entailments), of which 14.7k were created by our expert annotators.


Learning by Applying: A General Framework for Mathematical Reasoning via Enhancing Explicit Knowledge Learning

arXiv.org Artificial Intelligence

Mathematical reasoning is one of the crucial abilities of general artificial intelligence, which requires machines to master mathematical logic and knowledge from solving problems. However, existing approaches are not transparent (thus not interpretable) in terms of what knowledge has been learned and applied in the reasoning process. In this paper, we propose a general Learning by Applying (LeAp) framework to enhance existing models (backbones) in a principled way by explicit knowledge learning. In LeAp, we perform knowledge learning in a novel problem-knowledge-expression paradigm, with a Knowledge Encoder to acquire knowledge from problem data and a Knowledge Decoder to apply knowledge for expression reasoning. The learned mathematical knowledge, including word-word relations and word-operator relations, forms an explicit knowledge graph, which bridges the knowledge "learning" and "applying" organically. Moreover, for problem solving, we design a semantics-enhanced module and a reasoning-enhanced module that apply knowledge to improve the problem comprehension and symbol reasoning abilities of any backbone, respectively. We theoretically prove the superiority of LeAp's autonomous learning mechanism. Experiments on three real-world datasets show that LeAp improves all backbones' performances, learns accurate knowledge, and achieves a more interpretable reasoning process.


Shrinking the Inductive Programming Search Space with Instruction Subsets

arXiv.org Artificial Intelligence

Inductive programming frequently relies on some form of search in order to identify candidate solutions. However, the size of the search space limits the use of inductive programming to the production of relatively small programs. If we could somehow correctly predict the subset of instructions required for a given problem, then inductive programming would be more tractable. We will show that this can be achieved in a high percentage of cases. This paper presents a novel model of programming language instruction co-occurrence that was built to support search space partitioning in the Zoea distributed inductive programming system. This consists of a collection of intersecting instruction subsets derived from a large sample of open source code. Using this approach, different parts of the search space can be explored in parallel. The number of subsets required does not grow linearly with the quantity of code used to produce them, and a manageable number of subsets is sufficient to cover a high percentage of unseen code. This approach also significantly reduces the overall size of the search space, often by many orders of magnitude.
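The "orders of magnitude" claim can be illustrated with back-of-the-envelope arithmetic. The sketch below assumes a deliberately simplified model in which a candidate program is a sequence of `program_length` instructions, so the search space has `num_instructions ** program_length` members. All of the numbers (200 instructions, subsets of 20, 1000 subsets, length 10) are hypothetical and are not taken from the paper or from Zoea.

```python
def search_space_size(num_instructions: int, program_length: int) -> int:
    """Count instruction sequences of a fixed length (simplified model)."""
    return num_instructions ** program_length


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
FULL_INSTRUCTION_SET = 200
SUBSET_SIZE = 20
NUM_SUBSETS = 1000
PROGRAM_LENGTH = 10

full = search_space_size(FULL_INSTRUCTION_SET, PROGRAM_LENGTH)
per_subset = search_space_size(SUBSET_SIZE, PROGRAM_LENGTH)

# Searching every subset (possibly in parallel) still covers far fewer
# candidates than searching the full instruction set once.
all_subsets = NUM_SUBSETS * per_subset

ratio_single = full // per_subset
ratio_all = full // all_subsets
print(f"one subset is smaller by a factor of {ratio_single:,}")
print(f"all subsets together are smaller by a factor of {ratio_all:,}")
```

Under these assumed figures, each subset can be handed to a separate worker, which is one way to read the abstract's remark about exploring different parts of the search space in parallel.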


Plan-Based Derivation of General Functional Structures in Product Design

arXiv.org Artificial Intelligence

In product design, a decomposition of the overall product function into a set of smaller, interacting functions is usually considered a crucial first step for any computer-supported design tool. Here, we propose a new approach for the decomposition of functions especially suited for later solutions based on Artificial Intelligence. The presented approach defines the decomposition problem in terms of a planning problem--a well-established field in Artificial Intelligence. For the planning problem, logic-based solvers can be used to find solutions that compute a useful function structure for the design process. Well-known function libraries from engineering are used as atomic planning steps. The algorithms are evaluated using two different application examples to ensure the transferability of a general function decomposition.
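The idea of treating function decomposition as planning can be sketched in a few lines of Python. The example below is an illustration only: it uses a plain breadth-first forward search rather than the logic-based solvers mentioned in the abstract, and the function library, flow names, and the `plan` helper are all hypothetical. Each atomic function is a planning step with required input flows and produced output flows; the plan is the sequence of functions that yields the desired overall output.

```python
from collections import deque

# Hypothetical function library: name -> (required flows, produced flows).
LIBRARY = {
    "import_electricity": (frozenset(), frozenset({"electricity"})),
    "convert_electricity_to_rotation": (frozenset({"electricity"}), frozenset({"rotation"})),
    "convert_rotation_to_airflow": (frozenset({"rotation"}), frozenset({"airflow"})),
    "convert_electricity_to_heat": (frozenset({"electricity"}), frozenset({"heat"})),
}


def plan(initial_flows, goal_flows):
    """Breadth-first search for a shortest sequence of library functions."""
    start = frozenset(initial_flows)
    goal = frozenset(goal_flows)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps  # every required flow is now available
        for name, (needs, gives) in LIBRARY.items():
            if needs <= state:  # the step is applicable in this state
                next_state = state | gives
                if next_state not in seen:
                    seen.add(next_state)
                    queue.append((next_state, steps + [name]))
    return None  # no function structure reaches the goal


structure = plan(set(), {"airflow", "heat"})
print(structure)
```

The returned list is a linear ordering of functions; a real function structure would also record which flow connects which pair of functions, which this sketch leaves out.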