

 Schiex, Thomas


Scalable Coupling of Deep Learning with Logical Reasoning

arXiv.org Artificial Intelligence

In the ongoing quest for hybridizing discrete reasoning with neural nets, there is an increasing interest in neural architectures that can learn how to solve discrete reasoning or optimization problems from natural inputs. In this paper, we introduce a scalable neural architecture and loss function dedicated to learning the constraints and criteria of NP-hard reasoning problems expressed as discrete Graphical Models. Our loss function solves one of the main limitations of Besag's pseudo-loglikelihood, enabling the learning of high energies. We empirically show that it can efficiently learn how to solve NP-hard reasoning problems from natural inputs, such as the symbolic, visual, or many-solution Sudoku problems, as well as the energy-optimization formulation of the protein design problem, providing data efficiency, interpretability, and a posteriori control over predictions.
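
To make the learning signal concrete, here is a minimal NumPy sketch of Besag's pseudo-loglikelihood for a pairwise discrete graphical model, the quantity whose limitation the proposed loss addresses; the data layout, function names and toy instance are illustrative assumptions, not the paper's architecture or loss.

```python
import numpy as np

def neg_pseudo_loglikelihood(unary, pairwise, x):
    """Negative pseudo-loglikelihood of assignment x under a pairwise model.

    unary:    dict i -> array of shape (d_i,), unary energies theta_i
    pairwise: dict (i, j) -> array of shape (d_i, d_j), energies theta_ij, i < j
    x:        dict i -> observed value of variable i

    With E(x) = sum_i theta_i(x_i) + sum_{i<j} theta_ij(x_i, x_j) and
    P(x) proportional to exp(-E(x)), the pseudo-loglikelihood replaces
    log P(x) by sum_i log P(x_i | x_{-i}), which only needs per-variable
    normalization instead of the intractable partition function.
    """
    nll = 0.0
    for i, theta_i in unary.items():
        # energy of each candidate value of variable i, all other variables fixed
        local = theta_i.astype(float).copy()
        for (a, b), theta_ab in pairwise.items():
            if a == i:
                local += theta_ab[:, x[b]]
            elif b == i:
                local += theta_ab[x[a], :]
        # -log P(x_i | x_{-i}) = local[x_i] + log sum_v exp(-local[v])
        m = (-local).max()
        nll += local[x[i]] + m + np.log(np.exp(-local - m).sum())
    return nll

# Toy 2-variable, 3-value instance (made up for illustration)
unary = {0: np.zeros(3), 1: np.zeros(3)}
pairwise = {(0, 1): np.array([[0., 5., 5.], [5., 0., 5.], [5., 5., 0.]])}
print(neg_pseudo_loglikelihood(unary, pairwise, {0: 0, 1: 1}))
```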


Efficient semidefinite bounds for multi-label discrete graphical models

arXiv.org Artificial Intelligence

By concisely representing a joint function of many variables as the combination of small functions, discrete graphical models (GMs) provide a powerful framework to analyze stochastic and deterministic systems of interacting variables. One of the main queries on such models is to identify the extremum of this joint function. This is known as the Weighted Constraint Satisfaction Problem (WCSP) on deterministic Cost Function Networks and as Maximum a Posteriori (MAP) inference on stochastic Markov Random Fields. Algorithms for approximate WCSP inference typically rely on local consistency algorithms or belief propagation. These methods are intimately related to linear programming (LP) relaxations and are often coupled with reparametrizations defined by the dual solution of the associated LP. Since the seminal work of Goemans and Williamson, it is well understood that convex SDP relaxations can provide superior guarantees to LP. But the inherent computational cost of interior point methods has limited their application. The situation has improved with the introduction of non-convex Burer-Monteiro style methods, which are well suited to handle the SDP relaxation of combinatorial problems with binary variables (such as MAXCUT, MaxSAT or MAP/Ising). We compute low-rank SDP upper and lower bounds for discrete pairwise graphical models with an arbitrary number of values and arbitrary binary cost functions by extending a Burer-Monteiro style method based on row-by-row updates. We consider a traditional dualized constraint approach and a dedicated Block Coordinate Descent (BCD) approach which avoids introducing large penalty coefficients into the formulation. On increasingly hard and dense WCSP/CFN instances, we observe that the BCD approach can outperform the dualized approach and provide tighter bounds than local consistency/convergent message passing approaches.
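
For intuition on the row-by-row updates, here is a hedged sketch of a Burer-Monteiro style coordinate solver on the MAXCUT SDP relaxation mentioned in the abstract (binary variables only); the rank, sweep count and toy graph are arbitrary choices, and this is not the paper's dualized or BCD algorithm for multi-valued WCSPs.

```python
import numpy as np

def maxcut_sdp_bound(W, rank=8, sweeps=100, seed=0):
    """Row-by-row Burer-Monteiro solver for the MAXCUT SDP relaxation.

    Factor X = V V^T with unit-norm rows v_i and minimize <W, X> (W symmetric,
    zero diagonal) by cyclic row updates v_i <- -g_i/||g_i||, g_i = sum_j W_ij v_j.
    Each update is the exact minimizer of the objective in v_i, so the objective
    decreases monotonically. Returns (1/4) sum_ij W_ij (1 - X_ij), an upper
    bound on the maximum cut value.
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    V = rng.normal(size=(n, rank))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    for _ in range(sweeps):
        for i in range(n):
            g = W[i] @ V                  # gradient of <W, VV^T> w.r.t. v_i
            norm = np.linalg.norm(g)
            if norm > 1e-12:
                V[i] = -g / norm          # exact minimization over the unit sphere
    X = V @ V.T
    return 0.25 * np.sum(W * (1.0 - X))

# 5-cycle with unit weights: the SDP bound (about 4.52) exceeds the true max cut of 4
W = np.zeros((5, 5))
for i in range(5):
    W[i, (i + 1) % 5] = W[(i + 1) % 5, i] = 1.0
print(maxcut_sdp_bound(W))
```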


Exact and approximate inference in graphical models: variable elimination and beyond

arXiv.org Artificial Intelligence

Probabilistic graphical models offer a powerful framework to account for the dependence structure between variables, which is represented as a graph. However, the dependence between variables may render inference tasks intractable. In this paper we review techniques exploiting the graph structure for exact inference, borrowed from optimisation and computer science. They are built on the principle of variable elimination, whose complexity is dictated in an intricate way by the order in which variables are eliminated. The so-called treewidth of the graph characterises this algorithmic complexity: low-treewidth graphs can be processed efficiently. The first message we illustrate is therefore that, for inference in graphical models, the number of variables is not the limiting factor, and it is worth checking the treewidth before turning to approximate methods. We show how algorithms providing an upper bound on the treewidth can be exploited to derive a 'good' elimination order enabling exact inference. The second message is that when the treewidth is too large, algorithms for approximate inference linked to the principle of variable elimination, such as loopy belief propagation and variational approaches, can lead to accurate results while being much less time-consuming than Monte Carlo approaches. We illustrate the techniques reviewed in this article on benchmarks of inference problems in genetic linkage analysis and computer vision, as well as on hidden variable restoration in coupled Hidden Markov Models.
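
As a concrete companion to the first message, here is a small sketch of the min-fill greedy heuristic, one standard way to obtain an elimination order whose induced width upper-bounds the treewidth; the graph encoding and the toy example are assumptions made for illustration, not the benchmarks of the paper.

```python
import itertools

def min_fill_order(adj):
    """Greedy min-fill elimination order on an undirected graph.

    adj: dict node -> set of neighbouring nodes.
    Returns (order, induced_width). The induced width of the order is an
    upper bound on the treewidth; eliminating variables in this order keeps
    every intermediate factor of variable elimination on at most
    induced_width + 1 variables.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}     # work on a copy
    order, width = [], 0
    while adj:
        # fill-in of v: edges to add so that its neighbours form a clique
        def fill(v):
            return sum(1 for a, b in itertools.combinations(adj[v], 2)
                       if b not in adj[a])
        v = min(adj, key=fill)
        width = max(width, len(adj[v]))
        for a, b in itertools.combinations(adj[v], 2):  # connect neighbours
            adj[a].add(b)
            adj[b].add(a)
        for u in adj[v]:                                # remove v
            adj[u].discard(v)
        order.append(v)
        del adj[v]
    return order, width

# 4-cycle with one chord (treewidth 2), just for illustration
graph = {1: {2, 4}, 2: {1, 3, 4}, 3: {2, 4}, 4: {1, 2, 3}}
print(min_fill_order(graph))   # e.g. ([1, 3, 2, 4], 2)
```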


Tractability and Decompositions of Global Cost Functions

arXiv.org Artificial Intelligence

Enforcing local consistencies in cost function networks is performed by applying so-called Equivalence Preserving Transformations (EPTs) to the cost functions. As EPTs transform the cost functions, they may break the property that made local consistency enforcement tractable on a global cost function. A global cost function is called tractable projection-safe when applying an EPT to it is tractable and does not break the tractability property. In this paper, we prove that, depending on the size r of the smallest scopes used for performing EPTs, the tractability of global cost functions can be preserved (r = 0) or destroyed (r > 1). When r = 1, the answer is indefinite. We show that on a large family of cost functions, EPTs can be computed via dynamic programming-based algorithms, leading to tractable projection-safety. We also show that when a global cost function can be decomposed into a Berge-acyclic network of bounded-arity cost functions, soft local consistencies such as soft Directed or Virtual Arc Consistency can directly emulate dynamic programming. These different approaches to decomposable cost functions are then embedded in a solver for extensive experiments that confirm the feasibility and efficiency of our proposal.
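
For readers unfamiliar with EPTs, here is a toy sketch of the most basic one, a projection that moves cost from a binary cost function onto a unary one while preserving the cost of every complete assignment; the dictionary-based encoding is an illustrative assumption, not the paper's solver.

```python
def project(binary, unary_i, dom_i, dom_j):
    """Project costs from a binary cost function c_ij onto the unary c_i.

    For each value a of variable i, move alpha = min_b c_ij(a, b) into c_i(a)
    and subtract it from every c_ij(a, b). The cost of every complete
    assignment is unchanged (an equivalence-preserving transformation), but
    the moved cost becomes visible to unary reasoning and lower bounds.
    """
    for a in dom_i:
        alpha = min(binary[(a, b)] for b in dom_j)
        unary_i[a] += alpha
        for b in dom_j:
            binary[(a, b)] -= alpha

# Illustrative 2-value example
dom = [0, 1]
c_ij = {(0, 0): 3, (0, 1): 5, (1, 0): 0, (1, 1): 2}
c_i = {0: 0, 1: 0}
project(c_ij, c_i, dom, dom)
print(c_i)   # {0: 3, 1: 0}
print(c_ij)  # {(0, 0): 0, (0, 1): 2, (1, 0): 0, (1, 1): 2}
```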


Bounds Arc Consistency for Weighted CSPs

arXiv.org Artificial Intelligence

The Weighted Constraint Satisfaction Problem (WCSP) framework allows representing and solving problems involving both hard constraints and cost functions. It has been applied to various problems, including resource allocation, bioinformatics, and scheduling. To solve such problems, solvers usually rely on branch-and-bound algorithms equipped with local consistency filtering, mostly soft arc consistency. However, these techniques are not well suited to solving problems with very large domains. Motivated by the resolution of an RNA gene localization problem inside large genomic sequences, and in the spirit of bounds consistency for large domains in crisp CSPs, we introduce soft bounds arc consistency (BAC), a new weighted local consistency specifically designed for WCSPs with very large domains. Compared to soft arc consistency, BAC provides significantly improved time and space asymptotic complexity. In this paper, we show how the semantics of cost functions can be exploited to further improve the time complexity of BAC. We also compare, both in theory and in practice, the efficiency of BAC on a WCSP with that of bounds consistency enforced on a crisp CSP using cost variables. On two different real problems modeled as WCSPs, including our RNA gene localization problem, we observe that maintaining bounds arc consistency outperforms arc consistency and also improves over bounds consistency enforced on a constraint model with cost variables.
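
The following is a deliberately simplified sketch of the bounds-oriented idea (not the paper's BAC algorithm): only the two endpoints of an interval domain are tested against the global upper bound, so a huge domain never has to be enumerated value by value. The cost function, thresholds and numbers are invented for illustration.

```python
def prune_bounds(unary_cost, lb, ub, c0, top):
    """Shrink the interval domain [lb, ub] of one variable, bounds-style.

    A bound value v is pruned when c0 + unary_cost(v) >= top, i.e. when no
    assignment using v can improve on the current upper bound `top`
    (c0 is the current problem lower bound). Only the two bounds are ever
    tested, never the interior values, which is what keeps the cost low on
    very large domains.
    """
    while lb <= ub and c0 + unary_cost(lb) >= top:
        lb += 1
    while ub >= lb and c0 + unary_cost(ub) >= top:
        ub -= 1
    return lb, ub

# Illustrative: a position variable over a large genomic interval with a
# V-shaped unary cost around position 500,000
cost = lambda v: abs(v - 500_000) // 1_000
print(prune_bounds(cost, 0, 1_000_000, c0=0, top=300))   # (200001, 799999)
```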


Possibilistic Constraint Satisfaction Problems or "How to handle soft constraints?"

arXiv.org Artificial Intelligence

Many AI synthesis problems such as planning or scheduling may be modelled as constraint satisfaction problems (CSPs). A CSP is typically defined as the problem of finding any consistent labeling for a fixed set of variables satisfying all given constraints between these variables. However, for many real tasks such as job-shop scheduling, timetable scheduling, or design, these constraints do not all have the same significance and do not all have to be satisfied. A first distinction can be made between hard constraints, which every solution must satisfy, and soft constraints, whose satisfaction does not have to be certain. In this paper, we formalize the notion of possibilistic constraint satisfaction problems, which allows the modeling of uncertainly satisfied constraints. We use a possibility distribution over labelings to represent the respective possibility of each labeling. Necessity-valued constraints allow a simple expression of the respective certainty degree of each constraint. The main advantage of our approach is its integration into the CSP technical framework. Most classical techniques, such as backtracking (BT), arc-consistency enforcement (AC), and forward checking, have been extended to handle possibilistic CSPs and have been effectively implemented. The utility of our approach is demonstrated on a simple design problem.
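
As a small sketch of the min-based semantics only (the paper extends BT, AC and Forward Checking; this brute-force search does not), the following assumes each constraint carries a necessity degree and scores a labeling by its least-satisfied constraint; the names and the toy problem are illustrative.

```python
from itertools import product

def best_labeling(domains, constraints):
    """Exhaustive search for an optimal labeling of a possibilistic CSP.

    constraints: list of (necessity, check) pairs, with necessity in [0, 1]
    and check(assignment) -> bool. The consistency degree of a labeling is
    min over violated constraints of (1 - necessity), or 1.0 if nothing is
    violated; we return a labeling maximizing this degree, so a constraint
    of necessity 1 behaves as a hard constraint.
    """
    variables = list(domains)
    best, best_deg = None, -1.0
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        deg = min((1.0 - nec for nec, check in constraints
                   if not check(assignment)), default=1.0)
        if deg > best_deg:
            best, best_deg = assignment, deg
    return best, best_deg

# Toy over-constrained problem: hard "x != y", softer preference "x = 0"
domains = {"x": [0, 1], "y": [0, 1]}
constraints = [(1.0, lambda a: a["x"] != a["y"]),
               (0.6, lambda a: a["x"] == 0)]
print(best_labeling(domains, constraints))   # ({'x': 0, 'y': 1}, 1.0)
```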


Filtering Decomposable Global Cost Functions

AAAI Conferences

As Lee et al. (2012) have shown, weighted constraint satisfaction problems can benefit from the introduction of global cost functions, leading to a new Cost Function Programming paradigm. In this paper, we explore the possibility of decomposing global cost functions in such a way that enforcing soft local consistencies on the decomposition offers guarantees on the level of consistency enforced on the original global cost function. We show that directional arc consistency and virtual arc consistency offer such guarantees. We conclude with experiments on decomposable cost functions showing that decompositions can be very useful for easily integrating efficient global cost functions into solvers.
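
As an illustration of what a decomposition looks like, here is a toy sketch of a SoftAllDifferent-style global cost function with the pair-based (decomposition) violation measure, rewritten as a set of binary cost functions; this only shows the modelling idea, not the paper's consistency guarantees or filtering algorithms.

```python
def soft_all_diff_decomposition(variables):
    """Decompose a SoftAllDifferent-style cost function (decomposition
    measure) into binary cost functions: one soft 'x != y' of violation
    cost 1 per unordered pair of variables. The sum of the binary costs
    equals the number of equal pairs, i.e. the global cost being decomposed.
    """
    return [((x, y), lambda a, b: 0 if a != b else 1)
            for i, x in enumerate(variables)
            for y in variables[i + 1:]]

decomposition = soft_all_diff_decomposition(["x1", "x2", "x3"])
assignment = {"x1": 2, "x2": 2, "x3": 5}
cost = sum(f(assignment[u], assignment[v]) for (u, v), f in decomposition)
print(cost)   # one equal pair -> global violation cost 1
```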


From influence diagrams to multi-operator cluster DAGs

arXiv.org Artificial Intelligence

There exist several architectures for solving influence diagrams using local computations, such as the Shenoy-Shafer, HUGIN, or Lazy Propagation architectures. They all extend the usual variable elimination algorithms through the use of so-called 'potentials'. In this paper, we introduce a new architecture, called the Multi-operator Cluster DAG architecture, which can produce decompositions with an improved constrained induced-width, and therefore yield potentially exponential gains. Its principle is to exploit the composite nature of influence diagrams, instead of using uniform potentials, in order to better analyze the problem structure.
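
To see why several elimination operators coexist in an influence diagram, here is a tiny numeric sketch: chance variables are eliminated with an expectation (weighted sum) and decisions with a maximization, and the two cannot be freely reordered. The two-variable example and its numbers are invented for illustration and are unrelated to the MCDAG architecture itself.

```python
import numpy as np

# Toy influence diagram: one chance variable (Weather) and one decision
# (Umbrella), with utility U(weather, decision).
p_weather = np.array([0.3, 0.7])          # P(rain), P(sun)
utility = np.array([[70.0,   0.0],        # U(rain, take), U(rain, leave)
                    [20.0, 100.0]])       # U(sun,  take), U(sun,  leave)

# Decision taken before observing the weather: Weather is eliminated first
# with the expectation operator, then the decision with max.
value_blind = (p_weather @ utility).max()          # max_d sum_w P(w) U(w, d)

# Decision taken after observing the weather: max is applied inside the
# expectation, i.e. the operators appear in the opposite order.
value_informed = p_weather @ utility.max(axis=1)   # sum_w P(w) max_d U(w, d)

print(value_blind, value_informed)                 # 70.0 91.0
```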