Learning higher-order logic programs

arXiv.org Artificial Intelligence

A key feature of inductive logic programming (ILP) is its ability to learn first-order programs, which are intrinsically more expressive than propositional programs. In this paper, we introduce techniques to learn higher-order programs. Specifically, we extend meta-interpretive learning (MIL) to support learning higher-order programs by allowing for \emph{higher-order definitions} to be used as background knowledge. Our theoretical results show that learning higher-order programs, rather than first-order programs, can reduce the textual complexity required to express programs which in turn reduces the size of the hypothesis space and sample complexity. We implement our idea in two new MIL systems: the Prolog system \namea{} and the ASP system \nameb{}. Both systems support learning higher-order programs and higher-order predicate invention, such as inventing functions for \tw{map/3} and conditions for \tw{filter/3}. We conduct experiments on four domains (robot strategies, chess playing, list transformations, and string decryption) that compare learning first-order and higher-order programs. Our experimental results support our theoretical claims and show that, compared to learning first-order programs, learning higher-order programs can significantly improve predictive accuracies and reduce learning times.


Learning Efficient Logic Programs

AAAI Conferences

Most logic-based machine learning algorithms rely on an Occamist bias where textual simplicity of hypotheses is optimised. This approach, however, fails to distinguish between the efficiencies of hypothesised programs, such as quick sort (O(n log n)) and bubble sort (O(n^2)). We address this issue by considering techniques to minimise both the resource complexity and textual complexity of hypothesised programs. We describe an algorithm proven to learn optimal resource complexity robot strategies, and we propose future work to generalise this approach to a broader class of logic programs.


Inductive general game playing

arXiv.org Artificial Intelligence

General game playing (GGP) is a framework for evaluating an agent's general intelligence across a wide range of tasks. In the GGP competition, an agent is given the rules of a game (described as a logic program) that it has never seen before. The task is for the agent to play the game, thus generating game traces. The winner of the GGP competition is the agent that gets the best total score over all the games. In this paper, we invert this task: a learner is given game traces and the task is to learn the rules that could produce the traces. This problem is central to inductive general game playing (IGGP). We introduce a technique that automatically generates IGGP tasks from GGP games. We introduce an IGGP dataset which contains traces from 50 diverse games, such as Sudoku, Sokoban, and Checkers. We claim that IGGP is difficult for existing inductive logic programming (ILP) approaches. To support this claim, we evaluate existing ILP systems on our dataset. Our empirical results show that most of the games cannot be correctly learned by existing systems. The best performing system solves only 40% of the tasks perfectly. Our results suggest that IGGP poses many challenges to existing approaches. Furthermore, because we can automatically generate IGGP tasks from GGP games, our dataset will continue to grow with the GGP competition, as new games are added every year. We therefore think that the IGGP problem and dataset will be valuable for motivating and evaluating future research.


Learning Efficient Logical Robot Strategies Involving Composable Objects

AAAI Conferences

Most logic-based machine learning algorithms rely on an Occamist bias where textual complexity of hypotheses is minimised. Within Inductive Logic Programming (ILP), this approach fails to distinguish between the efficiencies of hypothesised programs, such as quick sort (O(n log n)) and bubble sort (O(n 2 )). This paper addresses this issue by considering techniques to minimise both the textual complexity and resource complexity of hypothesised robot strategies. We develop a general framework for the problem of minimising resource complexity and show that on two robot strategy problems, 1) Postman 2) Sorter (recursively sort letters for delivery), the theoretical resource complexities of optimal strategies vary depending on whether objects can be composed within a strategy. The approach considered is an extension of Meta-Interpretive Learning (MIL), a recently developed paradigm in ILP which supports predicate invention and the learning of recursive logic programs. We introduce a new MIL implementation, Metagol O , and prove its convergence, with increasing numbers of randomly chosen examples to optimal strategies of this kind. Our experiments show that Metagol O learns theoretically optimal robot sorting strategies, which is in agreement with the theoretical predictions showing a clear divergence in resource requirements as the number of objects grows. To the authors’ knowledge this paper is the first demonstration of a learning algorithm able to learn optimal resource complexity robot strategies and algorithms for sorting lists.


Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies?

arXiv.org Artificial Intelligence

World-class human players have been outperformed in a number of complex two person games (Go, Chess, Checkers) by Deep Reinforcement Learning systems. However, owing to tractability considerations minimax regret of a learning system cannot be evaluated in such games. In this paper we consider simple games (Noughts-and-Crosses and Hexapawn) in which minimax regret can be efficiently evaluated. We use these games to compare Cumulative Minimax Regret for variants of both standard and deep reinforcement learning against two variants of a new Meta-Interpretive Learning system called MIGO. In our experiments all tested variants of both normal and deep reinforcement learning have worse performance (higher cumulative minimax regret) than both variants of MIGO on Noughts-and-Crosses and Hexapawn. Additionally, MIGO's learned rules are relatively easy to comprehend, and are demonstrated to achieve significant transfer learning in both directions between Noughts-and-Crosses and Hexapawn.