Problem Solving
Improving Embedded Knowledge Graph Multi-hop Question Answering by introducing Relational Chain Reasoning
Jin, Weiqiang, Zhao, Biao, Yu, Hang, Tao, Xi, Yin, Ruiping, Liu, Guizhong
Knowledge Base Question Answering (KBQA) [1] is a service mining and analytics method that has attracted extensive attention from academic and industrial circles in recent years. Given a natural language question, a KBQA system aims to return the correct target entities from a given knowledge base (KB) [2]. Doing so requires capturing rich semantic information in order to understand the question clearly and to find the correct answers accurately in large-scale structured knowledge bases. Knowledge Graph Question Answering (KGQA) [3, 4] is a popular research branch of KBQA that uses a knowledge graph (KG) as its knowledge source [2, 5] and answers natural language questions from the factoid triples stored in the KG. Thanks to the KG's data structure and its efficient querying capability, users can acquire the substantial and valuable knowledge held in the KG more efficiently and enjoy a better experience.
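To make "answering from factoid triples" concrete, here is a minimal sketch of multi-hop lookup over a toy triple store. It follows an explicit chain of relations symbolically; it is not the embedding-based method of the paper, and every entity name, relation name, and function name below is a hypothetical placeholder.

```python
from collections import defaultdict


def build_index(triples):
    """Index (head, relation, tail) triples by (head, relation)."""
    index = defaultdict(set)
    for head, relation, tail in triples:
        index[(head, relation)].add(tail)
    return index


def follow_chain(index, topic_entity, relation_chain):
    """Follow a chain of relations from the topic entity, one hop per relation."""
    frontier = {topic_entity}
    for relation in relation_chain:
        # Replace the frontier with every tail reachable via this relation.
        frontier = {
            tail
            for head in frontier
            for tail in index.get((head, relation), ())
        }
    return frontier


# Toy knowledge graph (placeholder names, not real facts).
triples = [
    ("film_A", "directed_by", "person_B"),
    ("person_B", "born_in", "city_C"),
    ("film_A", "starring", "person_D"),
    ("person_D", "born_in", "city_E"),
]
index = build_index(triples)

# Two-hop question: "Where was the director of film_A born?"
answers = follow_chain(index, "film_A", ["directed_by", "born_in"])
print(answers)
```

A real KGQA system must also map the natural language question to the topic entity and the relation chain, which this sketch takes as given.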
How far have we come with Language Models, part 3 (Artificial Intelligence)
Abstract: Conceptual knowledge is fundamental to human cognition and knowledge bases. However, existing knowledge probing works only focus on evaluating factual knowledge of pre-trained language models (PLMs) and ignore conceptual knowledge. Since conceptual knowledge often appears as implicit commonsense behind texts, designing probes for conceptual knowledge is hard. Inspired by knowledge representation schemata, we comprehensively evaluate conceptual knowledge of PLMs by designing three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. For the tasks, we collect and annotate 24k data instances covering 393 concepts, which is COPEN, a COnceptual knowledge Probing bENchmark.
NeuroCERIL: Robotic Imitation Learning via Hierarchical Cause-Effect Reasoning in Programmable Attractor Neural Networks
Davis, Gregory P., Katz, Garrett E., Gentili, Rodolphe J., Reggia, James A.
Imitation learning allows social robots to learn new skills from human teachers without substantial manual programming, but it is difficult for robotic imitation learning systems to generalize demonstrated skills as well as human learners do. Contemporary neurocomputational approaches to imitation learning achieve limited generalization at the cost of data-intensive training, and often produce opaque models that are difficult to understand and debug. In this study, we explore the viability of developing purely-neural controllers for social robots that learn to imitate by reasoning about the underlying intentions of demonstrated behaviors. We present NeuroCERIL, a brain-inspired neurocognitive architecture that uses a novel hypothetico-deductive reasoning procedure to produce generalizable and human-readable explanations for demonstrated behavior. This approach combines bottom-up abductive inference with top-down predictive verification, and captures important aspects of human causal reasoning that are relevant to a broad range of cognitive domains. Our empirical results demonstrate that NeuroCERIL can learn various procedural skills in a simulated robotic imitation learning domain. We also show that its causal reasoning procedure is computationally efficient, and that its memory use is dominated by highly transient short-term memories, much like human working memory. We conclude that NeuroCERIL is a viable neural model of human-like imitation learning that can improve human-robot collaboration and contribute to investigations of the neurocomputational basis of human cognition.
BayesPCN: A Continually Learnable Predictive Coding Associative Memory
Associative memory plays an important role in human intelligence and its mechanisms have been linked to attention in machine learning. While the machine learning community's interest in associative memories has recently been rekindled, most work has focused on memory recall ($read$) over memory learning ($write$). In this paper, we present BayesPCN, a hierarchical associative memory capable of performing continual one-shot memory writes without meta-learning. Moreover, BayesPCN is able to gradually forget past observations ($forget$) to free its memory. Experiments show that BayesPCN can recall corrupted i.i.d. high-dimensional data observed hundreds to a thousand "timesteps" ago without a large drop in recall ability compared to the state-of-the-art offline-learned parametric memory models.
Accountable and Explainable Methods for Complex Reasoning over Text
A major concern with Machine Learning (ML) models is their opacity. They are deployed in an increasing number of applications where they often operate as black boxes that do not provide explanations for their predictions. Among others, the potential harms associated with the lack of understanding of the models' rationales include privacy violations, adversarial manipulations, and unfair discrimination. As a result, the accountability and transparency of ML models have been posed as critical desiderata by works in policy and law, philosophy, and computer science. In computer science, the decision-making process of ML models has been studied by developing accountability and transparency methods. Accountability methods, such as adversarial attacks and diagnostic datasets, expose vulnerabilities of ML models that could lead to malicious manipulations or systematic faults in their predictions. Transparency methods explain the rationales behind models' predictions, gaining the trust of relevant stakeholders and potentially uncovering mistakes and unfairness in models' decisions. To this end, transparency methods have to meet accountability requirements as well, e.g., being robust and faithful to the underlying rationales of a model. This thesis presents my research that expands our collective knowledge in the areas of accountability and transparency of ML models developed for complex reasoning tasks over text.
Solving the Watchman Route Problem with Heuristic Search
Skyler, Shawn (Ben-Gurion University) | Atzmon, Dor (Ben-Gurion University) | Yaffe, Tamir (Ben-Gurion University) | Felner, Ariel
This paper solves the Watchman Route Problem (WRP) on a general discrete graph with heuristic search. Given a graph, a line-of-sight (LOS) function, and a start vertex, the task is to find (offline) a (shortest) path through the graph such that every vertex in the graph is seen from at least one vertex on the path. WRP is reminiscent of, but different from, graph covering and mapping problems, which are solved online on an unknown graph. We formalize WRP as a heuristic search problem and solve it optimally with an A*-based algorithm. We develop a series of admissible heuristics with increasing difficulty and accuracy. Our heuristics abstract the underlying graph into a disjoint line-of-sight graph (GDLS), which is based on disjoint clusters of vertices such that vertices within the same cluster have LOS to the same specific vertex. We use solutions for the Minimum Spanning Tree (MST) and the Traveling Salesman Problem (TSP) of GDLS as admissible heuristics for WRP. We theoretically and empirically investigate these heuristics. Then, we show how the optimal methods can be modified (by intelligently pruning away large sub-trees) to obtain various suboptimal solvers with and without bound guarantees. These suboptimal solvers are much faster and expand fewer nodes than the optimal solver, with only a minor reduction in solution quality.
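The search formulation can be illustrated with a small sketch in which a search state is a pair (current vertex, set of vertices seen so far). The code below is an assumption-laden illustration, not the authors' implementation: it runs A* with a pluggable heuristic that defaults to zero (which reduces to uniform-cost search) and does not implement the GDLS-based MST or TSP heuristics. The function name `solve_wrp`, the dictionary encodings of the graph and the LOS function, and the example graph are all hypothetical.

```python
import heapq
from itertools import count


def solve_wrp(graph, los, start, heuristic=lambda vertex, seen: 0):
    """A* over (vertex, seen-set) states.

    graph: dict vertex -> dict neighbor -> non-negative edge cost
    los:   dict vertex -> set of vertices visible from it (including itself)
    Returns (cost, path) for a cheapest path seeing every vertex, or None.
    The heuristic must be admissible for the result to be optimal.
    """
    all_vertices = frozenset(graph)
    start_seen = frozenset(los[start])
    tie = count()  # tie-breaker so the heap never compares sets or tuples
    frontier = [(heuristic(start, start_seen), next(tie), 0, start, start_seen, (start,))]
    best_g = {(start, start_seen): 0}

    while frontier:
        _, _, g, vertex, seen, path = heapq.heappop(frontier)
        if g > best_g.get((vertex, seen), float("inf")):
            continue  # stale queue entry; a cheaper one was already found
        if seen >= all_vertices:
            return g, list(path)
        for neighbor, cost in graph[vertex].items():
            new_seen = seen | los[neighbor]
            new_g = g + cost
            key = (neighbor, new_seen)
            if new_g < best_g.get(key, float("inf")):
                best_g[key] = new_g
                f = new_g + heuristic(neighbor, new_seen)
                heapq.heappush(frontier, (f, next(tie), new_g, neighbor, new_seen, path + (neighbor,)))
    return None


# Hypothetical corridor a-b-c-d-e with unit edge costs; each vertex sees
# itself and its direct neighbors.
graph = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {"d": 1},
}
los = {v: {v} | set(neighbors) for v, neighbors in graph.items()}

cost, path = solve_wrp(graph, los, "a")
print(cost, path)
```

Because the seen-set is part of the state, the state space grows exponentially with the number of vertices, which is why informative admissible heuristics matter for this problem.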
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
Peng, Hao, Wang, Xiaozhi, Hu, Shengding, Jin, Hailong, Hou, Lei, Li, Juanzi, Liu, Zhiyuan, Liu, Qun
Conceptual knowledge is fundamental to human cognition and knowledge bases. However, existing knowledge probing works only focus on evaluating factual knowledge of pre-trained language models (PLMs) and ignore conceptual knowledge. Since conceptual knowledge often appears as implicit commonsense behind texts, designing probes for conceptual knowledge is hard. Inspired by knowledge representation schemata, we comprehensively evaluate conceptual knowledge of PLMs by designing three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. For the tasks, we collect and annotate 24k data instances covering 393 concepts, which is COPEN, a COnceptual knowledge Probing bENchmark. Extensive experiments on different sizes and types of PLMs show that existing PLMs systematically lack conceptual knowledge and suffer from various spurious correlations. We believe this is a critical bottleneck for realizing human-like cognition in PLMs. COPEN and our codes are publicly released at https://github.com/THU-KEG/COPEN.
Humans decompose tasks by trading off utility and computational cost
Correa, Carlos G., Ho, Mark K., Callaway, Frederick, Daw, Nathaniel D., Griffiths, Thomas L.
Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions. How are these decompositions created and used? Here, we propose and evaluate a normative framework for task decomposition based on the simple idea that people decompose tasks to reduce the overall cost of planning while maintaining task performance. Analyzing 11,117 distinct graph-structured planning tasks, we find that our framework justifies several existing heuristics for task decomposition and makes predictions that can be distinguished from two alternative normative accounts. We report a behavioral study of task decomposition ($N=806$) that uses 30 randomly sampled graphs, a larger and more diverse set than that of any previous behavioral study on this topic. We find that human responses are more consistent with our framework for task decomposition than alternative normative accounts and are most consistent with a heuristic -- betweenness centrality -- that is justified by our approach. Taken together, our results provide new theoretical insight into the computational principles underlying the intelligent structuring of goal-directed behavior.
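For readers unfamiliar with the heuristic named above, the betweenness centrality of a node is the sum, over all pairs of other nodes, of the fraction of shortest paths between the pair that pass through it. The sketch below computes it with Brandes' algorithm for an unweighted, undirected graph. The example graph (two triangles joined through a single "doorway" node) is hypothetical and is not one of the graphs used in the study.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm. adj: dict node -> iterable of neighbors (undirected)."""
    centrality = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {v: [] for v in adj}
        num_paths = {v: 0 for v in adj}
        num_paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += num_paths[v] / num_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # In an undirected graph each pair is visited from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical task graph: two triangles connected through node "d".
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f", "g"],
    "f": ["e", "g"],
    "g": ["e", "f"],
}
scores = betweenness_centrality(adj)
subgoal = max(scores, key=scores.get)
print(subgoal, scores[subgoal])
```

In this toy graph the connecting node scores highest, which matches the intuition that bottleneck states make natural subgoals.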
Final Report on MITRE Evaluations for the DARPA Big Mechanism Program
Peterson, Matthew, Korves, Tonia, Garay, Christopher, Kozierok, Robyn, Hirschman, Lynette
This report presents the evaluation approach developed for the DARPA Big Mechanism program, which aimed at developing computer systems that will read research papers, integrate the information into a computer model of cancer mechanisms, and frame new hypotheses. We employed an iterative, incremental approach to the evaluation of the three phases of the program. In Phase I, we evaluated the ability of system and human teams to read-with-a-model to capture mechanistic information from the biomedical literature, integrated with information from expert-curated biological databases. In Phase II, we evaluated the ability of systems to assemble fragments of information into a mechanistic model. The Phase III evaluation focused on the ability of systems to provide explanations of experimental observations based on models assembled (largely automatically) by the Big Mechanism process. The evaluation for each phase built on earlier evaluations and guided developers towards creating capabilities for the new phase. The report describes our approach, including innovations such as a reference set (a curated data set limited to major findings of each paper) to assess the accuracy of systems in extracting mechanistic findings in the absence of a gold standard, and a method to evaluate model-based explanations of experimental data. Results of the evaluation and supporting materials are included in the appendices.
Learning Probabilistic Temporal Safety Properties from Examples in Relational Domains
Rens, Gavin, Yang, Wen-Chi, Raskin, Jean-François, De Raedt, Luc
Many recent publications report on methods for achieving safety in Markov Decision Processes (MDPs), where temporal logic (safety) specifications must be satisfied [1-4]. However, it is typically assumed that 1) the safety specification is given, and 2) the states in the underlying MDP are unstructured. In this paper, we are interested in 1) learning the safety specification from examples, and 2) working with relational MDPs. More specifically, in our learning setting we assume that there is a domain expert who is presented with a set of system states E, a probability threshold α and a step-bound k (number of action executions). If the expert believes that the system, starting in a state s ∈ E, will perform actions that lead to a dangerous temporal situation within k steps with probability at least α, then she will label s as dangerous; otherwise, as safe. Now, given this set E of labeled states, we want to learn a compact temporal logic formula summarizing the expert's advice. There are at least three reasons to infer a property (expressed as a temporal logic formula) from an expert's advice: firstly, to obtain a concise, human-interpretable expression of some aspects of the domain [5-7]; secondly, to verify a system's control behavior (policy) w.r.t. a set of (safety) standards [6, 8]; and thirdly, to use the (safety) property to devise strategies for the system or agent to avoid undesirable situations [8-10]. Furthermore, we consider systems that can be modelled as relational MDPs (RMDPs).
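The labeling rule described above can be sketched in code under strong simplifying assumptions that are not taken from the paper: the system follows a fixed policy (so it behaves as a Markov chain), states are flat rather than relational, the "dangerous temporal situation" is simply reaching one of a known set of bad states, and the expert's belief coincides with the true transition probabilities. All names and the example chain are hypothetical.

```python
def reach_within(transitions, bad, k):
    """Probability of reaching a bad state within k steps, for every state.

    transitions: dict state -> dict successor -> probability
    bad:         set of states considered dangerous situations
    """
    # Zero steps: only states that are already bad have probability 1.
    prob = {s: (1.0 if s in bad else 0.0) for s in transitions}
    for _ in range(k):
        prob = {
            s: 1.0 if s in bad else sum(p * prob[t] for t, p in transitions[s].items())
            for s in transitions
        }
    return prob


def label_states(transitions, bad, examples, alpha, k):
    """Label each example state 'dangerous' if its k-step risk is at least alpha."""
    prob = reach_within(transitions, bad, k)
    return {s: ("dangerous" if prob[s] >= alpha else "safe") for s in examples}


# Hypothetical three-state chain: s0 drifts to s1, s1 drifts to the bad state.
transitions = {
    "s0": {"s0": 0.5, "s1": 0.5},
    "s1": {"s1": 0.5, "bad": 0.5},
    "bad": {"bad": 1.0},
}
risk = reach_within(transitions, {"bad"}, k=2)
labels = label_states(transitions, {"bad"}, ["s0", "s1"], alpha=0.5, k=2)
print(risk, labels)
```

The learning problem in the paper starts from labels like these and searches for a compact temporal logic formula that explains them; this sketch only shows how such labels could arise.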