
Semirings for Probabilistic and Neuro-Symbolic Logic Programming

Derkinderen, Vincent, Manhaeve, Robin, Martires, Pedro Zuidberg Dos, De Raedt, Luc

arXiv.org Artificial Intelligence

The original framework of Poole and Sato extended the logic programming language Prolog (Flach, 1994) with probabilistic facts. These are facts annotated with the probability that they are true; they play a role similar to the parentless nodes in Bayesian networks in that they are marginally independent of one another, with all probabilistic dependencies induced by the rules of the logic program. This resulted in the celebrated distribution semantics (Sato, 1995) that is the basis of probabilistic logic programming, and the corresponding learning algorithm in the PRISM language (Sato, 1995) constitutes - to the best of the authors' knowledge - the very first probabilistic programming language with built-in support for machine learning. The work of Sato and Poole has inspired many follow-up works on inference and learning, as well as many variations and extensions of probabilistic logic programming and its celebrated distribution semantics.
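Under the distribution semantics, the probability of a query is the total weight of the possible worlds (total choices over the probabilistic facts) in which the query is derivable. A minimal brute-force sketch, using the standard textbook burglary/earthquake program rather than anything from this paper:

```python
from itertools import product

# Probabilistic facts: marginally independent, like parentless Bayesian-network nodes.
# Hypothetical example program:
#   0.6::burglary.  0.2::earthquake.
#   alarm :- burglary.  alarm :- earthquake.
facts = {"burglary": 0.6, "earthquake": 0.2}

def alarm(world):
    # The rules of the logic program, evaluated in one fixed world.
    return world["burglary"] or world["earthquake"]

def query_probability(query):
    # Distribution semantics: sum the weight of every possible world
    # (truth assignment to the probabilistic facts) in which the query holds.
    total = 0.0
    names = list(facts)
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        weight = 1.0
        for name, value in world.items():
            weight *= facts[name] if value else 1.0 - facts[name]
        if query(world):
            total += weight
    return total

print(query_probability(alarm))  # 1 - 0.4 * 0.8 = 0.68
```

Real systems such as ProbLog avoid this exponential enumeration by compiling the program into a circuit and evaluating it in a suitable semiring, which is precisely the abstraction the paper studies.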


First-Order Context-Specific Likelihood Weighting in Hybrid Probabilistic Logic Programs

Kumar, Nitesh (KU Leuven) | Kuželka, Ondřej (CTU in Prague) | De Raedt, Luc (KU Leuven)

Journal of Artificial Intelligence Research

Statistical relational AI and probabilistic logic programming have so far mostly focused on discrete probabilistic models. The reason for this is that one needs to provide constructs to succinctly model the independencies in such models, and also provide efficient inference. Three types of independencies are important to represent and exploit for scalable inference in hybrid models: conditional independencies elegantly modeled in Bayesian networks, context-specific independencies naturally represented by logical rules, and independencies amongst attributes of related objects in relational models succinctly expressed by combining rules. This paper introduces a hybrid probabilistic logic programming language, DC#, which integrates the syntax of distributional clauses with the semantic principles of Bayesian logic programs. It represents the three types of independencies qualitatively. More importantly, we also introduce the scalable inference algorithm FO-CS-LW for DC#. FO-CS-LW is a first-order extension of the context-specific likelihood weighting algorithm (CS-LW), a novel sampling method that exploits conditional independencies and context-specific independencies in ground models. The FO-CS-LW algorithm upgrades CS-LW with unification and combining rules to the first-order case.
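As a toy illustration of plain likelihood weighting (the baseline that CS-LW and FO-CS-LW extend with context-specific pruning), consider estimating a posterior in a small hybrid model. The model below is hypothetical and not written in the paper's DC# syntax:

```python
import math
import random

# Toy discrete-continuous model (hypothetical):
#   cloudy ~ Bernoulli(0.5)
#   rain | cloudy ~ Bernoulli(0.8 if cloudy else 0.1)
#   temp | rain ~ Normal(15, 2) if rain else Normal(22, 3)
# Estimate P(cloudy | temp = 14.0): sample the unobserved variables
# forward, and weight each sample by the likelihood of the evidence
# instead of sampling the evidence variable.

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def likelihood_weighting(evidence_temp, n_samples=100_000, seed=0):
    rng = random.Random(seed)
    weighted = total = 0.0
    for _ in range(n_samples):
        cloudy = rng.random() < 0.5
        rain = rng.random() < (0.8 if cloudy else 0.1)
        # Evidence is not sampled; it contributes a likelihood weight.
        w = normal_pdf(evidence_temp, 15.0, 2.0) if rain else normal_pdf(evidence_temp, 22.0, 3.0)
        weighted += w * cloudy
        total += w
    return weighted / total

print(likelihood_weighting(14.0))  # roughly 0.87: a low temperature suggests rain, hence cloud
```

CS-LW improves on this scheme by detecting, per sample, which parts of the model are rendered irrelevant by the context, so that only a subset of the variables needs to be sampled.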


A Markov Framework for Learning and Reasoning About Strategies in Professional Soccer

Van Roy, Maaike (KU Leuven) | Robberechts, Pieter | Yang, Wen-Chi | De Raedt, Luc | Davis, Jesse

Journal of Artificial Intelligence Research

Strategy optimization is a fundamental element of dynamic and complex team sports such as soccer, American football, and basketball. As the amount of data that is collected from matches in these sports has increased, so has the demand for data-driven decision-making support. If alternative strategies need to be balanced, a data-driven approach can uncover insights that are not available from qualitative analysis. This could tremendously aid teams in their match preparations. In this work, we propose a novel Markov model-based framework for soccer that allows reasoning about the specific strategies teams use in order to gain insights into the efficiency of each strategy. The framework consists of two components: (1) a learning component, which entails modeling a team's offensive behavior by learning a Markov decision process (MDP) from event data that is collected from the team's matches, and (2) a reasoning component, which involves a novel application of probabilistic model checking to reason about the efficacy of the learned strategies of each team. In this paper, we provide an overview of this framework and illustrate it on several use cases using real-world event data from three leagues. Our results show that the framework can be used to reason about the shot decision-making of teams and to optimise the defensive strategies used when playing against a particular team. The general ideas presented in this framework can easily be extended to other sports.
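To make the reasoning component concrete: probabilistic model checking of a reachability property such as "eventually score" on a learned possession model reduces to solving a fixed-point equation over the states. A minimal sketch with hypothetical states and transition probabilities (not the paper's learned models):

```python
# Hypothetical 4-state possession model: midfield -> box -> goal / loss.
# The probability of eventually reaching "goal" satisfies
#   x[s] = sum_t P(s, t) * x[t],  with x[goal] = 1 and x[loss] = 0,
# which we solve by fixed-point iteration.
P = {
    "midfield": {"midfield": 0.5, "box": 0.3, "loss": 0.2},
    "box": {"midfield": 0.2, "goal": 0.15, "loss": 0.65},
}

def reach_goal_probability(tol=1e-12):
    x = {"midfield": 0.0, "box": 0.0, "goal": 1.0, "loss": 0.0}
    while True:
        delta = 0.0
        for s, row in P.items():
            new = sum(p * x[t] for t, p in row.items())
            delta = max(delta, abs(new - x[s]))
            x[s] = new
        if delta < tol:
            return x

probs = reach_goal_probability()
print(probs["midfield"], probs["box"])
```

Tools like PRISM or Storm compute exactly such reachability probabilities (and richer temporal-logic properties) on MDPs, which is what enables comparing the efficacy of alternative strategies.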


De Raedt

AAAI Conferences

We study the problem of inducing logic programs in a probabilistic setting, in which both the example descriptions and their classification can be probabilistic. The setting is incorporated in the probabilistic rule learner ProbFOIL, which combines principles of the rule learner FOIL with ProbLog, a probabilistic Prolog. We illustrate the approach by applying it to the knowledge base of NELL, the Never-Ending Language Learner.


First-Order Context-Specific Likelihood Weighting in Hybrid Probabilistic Logic Programs

Kumar, Nitesh, Kuzelka, Ondrej, De Raedt, Luc

arXiv.org Artificial Intelligence

Statistical relational AI and probabilistic logic programming have so far mostly focused on discrete probabilistic models. The reason for this is that one needs to provide constructs to succinctly model the independencies in such models, and also provide efficient inference. Three types of independencies are important to represent and exploit for scalable inference in hybrid models: conditional independencies elegantly modeled in Bayesian networks, context-specific independencies naturally represented by logical rules, and independencies amongst attributes of related objects in relational models succinctly expressed by combining rules. This paper introduces a hybrid probabilistic logic programming language, DC#, which integrates the syntax of distributional clauses with the semantic principles of Bayesian logic programs. It represents the three types of independencies qualitatively. More importantly, we also introduce the scalable inference algorithm FO-CS-LW for DC#. FO-CS-LW is a first-order extension of the context-specific likelihood weighting algorithm (CS-LW), a novel sampling method that exploits conditional independencies and context-specific independencies in ground models.


Symbolic Logic meets Machine Learning: A Brief Survey in Infinite Domains

Belle, Vaishak

arXiv.org Artificial Intelligence

The tension between deduction and induction is perhaps the most fundamental issue in areas such as philosophy, cognition and artificial intelligence (AI). The deduction camp concerns itself with questions about the expressiveness of formal languages for capturing knowledge about the world, together with proof systems for reasoning from such knowledge bases. The learning camp attempts to generalize from examples of partial descriptions of the world. In AI, historically, these camps have loosely divided the development of the field, but advances in cross-over areas such as statistical relational learning, neuro-symbolic systems, and high-level control have illustrated that the dichotomy is not very constructive, and perhaps even ill-formed. In this article, we survey work that provides further evidence for the connections between logic and learning. Our narrative is structured in terms of three strands: logic versus learning, machine learning for logic, and logic for machine learning, but naturally, there is considerable overlap. We place an emphasis on the following "sore" point: there is a common misconception that logic is for discrete properties, whereas probability theory and machine learning, more generally, are for continuous properties. We report on results that challenge this view on the limitations of logic, and expose the role that logic can play for learning in infinite domains.


Human-Machine Collaboration for Democratizing Data Science

Gautrais, Clément, Dauxais, Yann, Teso, Stefano, Kolb, Samuel, Verbruggen, Gust, De Raedt, Luc

arXiv.org Artificial Intelligence

Data science is a cornerstone of current business practices. A major obstacle to its adoption is that most data analysis techniques are beyond the reach of typical end-users. Spreadsheets are a prime example of this phenomenon: despite being central in all sorts of data processing pipelines, the functionality necessary for processing and analyzing spreadsheets is hidden behind the high wall of spreadsheet formulas, which most end-users can neither write nor understand [Chambers and Scaffidi, 2010]. As a result, spreadsheets are often manipulated and analyzed manually. This increases the chance of making mistakes and prevents scaling beyond small data sets. Lowering the barrier to entry for specifying and solving data science tasks would help ameliorate these issues. Making data science tools more accessible would lower the cost of designing data processing pipelines and taking data-driven decisions. At the same time, accessible data science tools can prevent non-experts from relying on fragile heuristics and improvised solutions. The question we ask is then: is it possible to enable non-technical end-users to specify and solve data science tasks that match their needs?


SMT + ILP

Belle, Vaishak

arXiv.org Artificial Intelligence

Inductive logic programming (ILP) has been a deeply influential paradigm in AI, enjoying decades of research on its theory and implementations. As a natural descendent of the fields of logic programming and machine learning, it admits the incorporation of background knowledge, which can be very useful in domains where prior knowledge from experts is available and can lead to a more data-efficient learning regime. Be that as it may, the limitation to Horn clauses composed over Boolean variables is a very serious one. Many phenomena occurring in the real-world are best characterized using continuous entities, and more generally, mixtures of discrete and continuous entities. In this position paper, we motivate a reconsideration of inductive declarative programming by leveraging satisfiability modulo theory technology.


Learning Probabilistic Logic Programs in Continuous Domains

Speichert, Stefanie, Belle, Vaishak

arXiv.org Artificial Intelligence

The field of statistical relational learning aims at unifying logic and probability to reason and learn from data. Perhaps the most successful paradigm in the field is probabilistic logic programming: the enabling of stochastic primitives in logic programming, which is now increasingly seen to provide a declarative background to complex machine learning applications. While many systems offer inference capabilities, the more significant challenge is that of learning meaningful and interpretable symbolic representations from data. In that regard, inductive logic programming and related techniques have paved much of the way for the last few decades. Unfortunately, a major limitation of this exciting landscape is that much of the work is limited to finite-domain discrete probability distributions. Recently, a handful of systems have been extended to represent and perform inference with continuous distributions. The problem, of course, is that classical solutions for inference are either restricted to well-known parametric families (e.g., Gaussians) or resort to sampling strategies that provide correct answers only in the limit. When it comes to learning, moreover, inducing representations remains entirely open, other than "data-fitting" solutions that force-fit points to aforementioned parametric families. In this paper, we take the first steps towards inducing probabilistic logic programs for continuous and mixed discrete-continuous data, without being pigeon-holed to a fixed set of distribution families. Our key insight is to leverage techniques from piecewise polynomial function approximation theory, yielding a principled way to learn and compositionally construct density functions. We test the framework and discuss the learned representations.
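A minimal sketch of the key idea of piecewise polynomial density learning, with hypothetical choices for the number of pieces, degree, and binning; this is a simplified illustration, not the paper's actual learner:

```python
import numpy as np

# Approximate a 1-D density by fitting a low-degree polynomial to
# histogram heights within each piece, then renormalising so that the
# pieces integrate to exactly one.
def fit_piecewise_density(samples, n_pieces=4, degree=2, bins_per_piece=8):
    lo, hi = samples.min(), samples.max()
    edges = np.linspace(lo, hi, n_pieces + 1)
    pieces, total_mass = [], 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        hist, bin_edges = np.histogram(samples[(samples >= a) & (samples <= b)],
                                       bins=bins_per_piece, range=(a, b))
        density = hist / (len(samples) * (bin_edges[1] - bin_edges[0]))
        centers = (bin_edges[:-1] + bin_edges[1:]) / 2
        coeffs = np.polyfit(centers, density, degree)
        # Mass of this piece: evaluate the antiderivative at both endpoints.
        total_mass += np.polyval(np.polyint(coeffs), b) - np.polyval(np.polyint(coeffs), a)
        pieces.append((a, b, coeffs))
    # The integral is linear in the coefficients, so rescaling normalises it.
    return [(a, b, c / total_mass) for a, b, c in pieces]

rng = np.random.default_rng(0)
pieces = fit_piecewise_density(rng.normal(0.0, 1.0, 10_000))
total = sum(np.polyval(np.polyint(c), b) - np.polyval(np.polyint(c), a)
            for a, b, c in pieces)
print(total)  # 1.0 up to floating-point error
```

The appeal of this representation, which the paper exploits, is that polynomials are closed under the operations probabilistic inference needs (products, integration), so densities can be composed without committing to a parametric family.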


On Declarative Modeling of Structured Pattern Mining

Guns, Tias (KU Leuven) | Paramonov, Sergey (KU Leuven) | Negrevergne, Benjamin (Inria Rennes)

AAAI Conferences

Since the seminal work on frequent itemset mining, there has been considerable effort on mining more structured patterns such as sequences or graphs. Additionally, the field of constraint programming has been linked to the field of pattern mining, resulting in a more general and declarative constraint-based itemset mining framework. As a result, a number of recent papers have proposed to extend the declarative approach to structured pattern mining problems. Because the formalism and the solving mechanisms can be vastly different in specialised algorithms and declarative approaches, assessing the benefits and the drawbacks of each approach can be difficult. In this paper, we introduce a framework that formally defines the core components of itemset, sequence and graph mining tasks, and we use it to compare existing specialised algorithms to their declarative counterparts. This analysis allows us to draw clear connections between the two approaches and provide insights on how to overcome current limitations in declarative structured mining.
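For readers unfamiliar with the base task, a minimal specialised frequent itemset miner (an Apriori-style level-wise search) looks as follows; the declarative approach instead states the constraints (frequency, closedness, etc.) and leaves the search to a generic solver. The transactions below are an invented toy example:

```python
from itertools import combinations

# Level-wise (Apriori-style) frequent itemset mining:
# count candidates of size k, keep the frequent ones, join them into
# candidates of size k+1, and repeat until no candidate survives.
def frequent_itemsets(transactions, min_support):
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        # Candidate generation: join pairs of frequent k-itemsets that
        # differ in exactly one item.
        level = list({a | b for a, b in combinations(current, 2)
                      if len(a | b) == len(a) + 1})
    return frequent

transactions = [frozenset(t) for t in
                [{"bread", "milk"}, {"bread", "butter"},
                 {"bread", "milk", "butter"}, {"milk"}]]
print(frequent_itemsets(transactions, min_support=2))
```

In a constraint-programming encoding, the same task becomes a conjunction of constraints over Boolean item variables, which is what makes the framework easy to extend with extra constraints or to sequences and graphs.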