"Today's expert systems deal with domains of narrow specialization. For expert systems to perform competently over a broad range of tasks, they will have to be given very much more knowledge. ... The next generation of expert systems ... will require large knowledge bases. How will we get them?"
– Edward Feigenbaum, Pamela McCorduck, H. Penny Nii, from The Rise of the Expert Company. New York: Times Books, 1988.
The inclusion of'learning abilities' – mostly thought unique to humans and very few other evolved primates – defines artificial intelligence to a large extent. Faced with unfamiliar situations, how the program deals with the problems and attempts to solve them is key to identifying a stretch of software code as'artificially intelligent'. Artificial Intelligence has made the leap from science fiction to real life in a short matter of time. It was initially envisioned as a panacea for the intricate but repetitive processes that aided scientific research and technological advancement – a role it has fulfilled and, in many instances, surpassed. Training a program by making it understand a variety of sensory inputs, whether in the form of digital or analog data, does not mean that program has'intelligence'.
Knowledge-based support vector machines (KBSVMs) incorporate advice from domain experts, which can improve generalization significantly. A major limitation that has not been fully addressed occurs when the expert advice is imperfect, which can lead to poorer models. We propose a model that extends KBSVMs and is able to not only learn from data and advice, but also simultaneously improve the advice. The proposed approach is particularly effective for knowledge discovery in domains with few labeled examples. The proposed model contains bilinear constraints, and is solved using two iterative approaches: successive linear programming and a constrained concave-convex approach.
Causal inference is central to many areas of artificial intelligence, including complex reasoning, planning, knowledge-base construction, robotics, explanation, and fairness. An active community of researchers develops and enhances algorithms that learn causal models from data, and this work has produced a series of impressive technical advances. However, evaluation techniques for causal modeling algorithms have remained somewhat primitive, limiting what we can learn from experimental studies of algorithm performance, constraining the types of algorithms and model representations that researchers consider, and creating a gap between theory and practice. We argue for more frequent use of evaluation techniques that examine interventional measures rather than structural or observational measures, and that evaluate those measures on empirical data rather than synthetic data. We survey the current practice in evaluation and show that the techniques we recommend are rarely used in practice.
We present the Multi-value Rule Set (MRS) for interpretable classification with feature efficient presentations. Compared to rule sets built from single-value rules, MRS adopts a more generalized form of association rules that allows multiple values in a condition. Rules of this form are more concise than classical single-value rules in capturing and describing patterns in data. Our formulation also pursues a higher efficiency of feature utilization, which reduces possible cost in data collection and storage. We propose a Bayesian framework for formulating an MRS model and develop an efficient inference method for learning a maximum a posteriori, incorporating theoretically grounded bounds to iteratively reduce the search space and improve the search efficiency.
A common problem in knowledge representation and related fields is reasoning over a large joint knowledge graph, represented as triples of a relation between two entities. The goal of this paper is to develop a more powerful neural network model suitable for inference over these relationships. Previous models suffer from weak interaction between entities or simple linear projection of the vector space. We address these problems by introducing a neural tensor network (NTN) model which allow the entities and relations to interact multiplicatively. Additionally, we observe that such knowledge base models can be further improved by representing each entity as the average of vectors for the words in the entity name, giving an additional dimension of similarity by which entities can share statistical strength.
Statistical Relational Learning (SRL) methods are the most widely used techniques to generate distributional representations of the symbolic Knowledge Bases (KBs). These methods embed any given KB into a vector space by exploiting statistical similarities among its entities and predicates but without any guarantee of preserving the underlying logical structure of the KB. This, in turn, results in poor performance of logical reasoning tasks that are solved using such distributional representations. We present a novel approach called Embed2Reason (E2R) that embeds a symbolic KB into a vector space in a logical structure preserving manner. This approach is inspired by the theory of Quantum Logic.
This paper considers the learning of Boolean rules in either disjunctive normal form (DNF, OR-of-ANDs, equivalent to decision rule sets) or conjunctive normal form (CNF, AND-of-ORs) as an interpretable model for classification. An integer program is formulated to optimally trade classification accuracy for rule simplicity. Column generation (CG) is used to efficiently search over an exponential number of candidate clauses (conjunctions or disjunctions) without the need for heuristic rule mining. This approach also bounds the gap between the selected rule set and the best possible rule set on the training data. To handle large datasets, we propose an approximate CG algorithm using randomization.
In this work, we aim to leverage prior symbolic knowledge to improve the performance of deep models. We propose a graph embedding network that projects propositional formulae (and assignments) onto a manifold via an augmented Graph Convolutional Network (GCN). To generate semantically-faithful embeddings, we develop techniques to recognize node heterogeneity, and semantic regularization that incorporate structural constraints into the embedding. Experiments show that our approach improves the performance of models trained to perform entailment checking and visual relation prediction. Interestingly, we observe a connection between the tractability of the propositional theory representation and the ease of embedding.
We introduce deep neural networks for end-to-end differentiable theorem proving that operate on dense vector representations of symbols. These neural networks are recursively constructed by following the backward chaining algorithm as used in Prolog. Specifically, we replace symbolic unification with a differentiable computation on vector representations of symbols using a radial basis function kernel, thereby combining symbolic reasoning with learning subsymbolic vector representations. The resulting neural network can be trained to infer facts from a given incomplete knowledge base using gradient descent. By doing so, it learns to (i) place representations of similar symbols in close proximity in a vector space, (ii) make use of such similarities to prove facts, (iii) induce logical rules, and (iv) it can use provided and induced logical rules for complex multi-hop reasoning.