Constraint-Based Reasoning
A Survey on String Constraint Solving
They are a fundamental datatype in all the modern programming languages, and operations on strings frequently occur in disparate fields such as software analysis, model checking, database applications, web security, bioinformatics and so on[3, 11, 19, 21, 27, 28, 49, 60, 67]. Reasoning over strings requires solving arbitrarily complex string constraints, i.e., relations defined on a number of string variables. Typical examples of string constraints are string length, (dis-)equality, concatenation, substring, regular expression matching. With the term "string constraint solving" (in short, string solving or SCS) we refer to the process of modelling, processing, and solving combinatorial problems involving string constraints. We may see SCS as a declarative paradigm which falls at the intersection between constraint solving and combinatorics on words: the user states a problem with string variables and constraints, and a suitable string solver seeks a solution for that problem. Although works on the combinatorics of words were already published in the 1940s [110], the dawn of SCS dates back to the late 1980s in correspondence with the rise of Constraint Programming (CP) [114] and Constraint Logic Programming(CLP)[73] paradigms. Pioneers in this field were for example Trilogy[142], a language providing strings, integer and real constraints, and CLP(Σ) [144], an instance of the CLP scheme representing strings as regular sets. The latter in particular was the first known attempt to use string constraints like regular membership to denote regular sets.
Bringing freedom in variable choice when searching counter-examples in floating point programs
Zitoun, Heytem, Michel, Claude, Michel, Laurent, Rueher, Michel
Program verification techniques typically focus on finding counterexamples that violate properties of a program. Constraint programming offers a convenient way to verify programs by modeling their state transformations and specifying searches that seek counterexamples. Floating-point computations present additional challenges for verification given the semantic subtleties of floating point arithmetic. This paper focuses on search strategies for CSPs using floating point numbers constraint systems and dedicated to program verification. It introduces a new search heuristic based on the global number of occurrences that outperforms state-of-the-art strategies. More importantly, it demonstrates that a new technique that only branches on input variables of the verified program improve performance. It composes with a diversification technique that prevents the selection of the same variable within a fixed horizon further improving performances and reduces disparities between various variable choice heuristics. The result is a robust methodology that can tailor the search strategy according to the sought properties of the counter example.
An efficient constraint based framework forhandling floating point SMT problems
Zitoun, Heytem, Michel, Claude, Michel, Laurent, Rueher, Michel
This paper introduces the 2019 version of \us{}, a novel Constraint Programming framework for floating point verification problems expressed with the SMT language of SMTLIB. SMT solvers decompose their task by delegating to specific theories (e.g., floating point, bit vectors, arrays, ...) the task to reason about combinatorial or otherwise complex constraints for which the SAT encoding would be cumbersome or ineffective. This decomposition and encoding processes lead to the obfuscation of the high-level constraints and a loss of information on the structure of the combinatorial model. In \us{}, constraints over the floats are first class objects, and the purpose is to expose and exploit structures of floating point domains to enhance the search process. A symbolic phase rewrites each SMTLIB instance to elementary constraints, and eliminates auxiliary variables whose presence is counterproductive. A diversification technique within the search steers it away from costly enumerations in unproductive areas of the search space. The empirical evaluation demonstrates that the 2019 version of \us{} is competitive on computationally challenging floating point benchmarks that induce significant search efforts even for other CP solvers. It highlights that the ability to harness both inference and search is critical. Indeed, it yields a factor 3 improvement over Colibri and is up to 10 times faster than SMT solvers. The evaluation was conducted over 214 benchmarks (The Griggio suite) which is a standard within SMTLIB.
Teaching the Old Dog New Tricks: Supervised Learning with Constraints
Detassis, Fabrizio, Lombardi, Michele, Milano, Michela
Methods for taking into account external knowledge in Machine Learning models have the potential to address outstanding issues in data-driven AI methods, such as improving safety and fairness, and can simplify training in the presence of scarce data. We propose a simple, but effective, method for injecting constraints at training time in supervised learning, based on decomposition and bi-level optimization: a master step is in charge of enforcing the constraints, while a learner step takes care of training the model. The process leads to approximate constraint satisfaction. The method is applicable to any ML approach for which the concept of label (or target) is well defined (most regression and classification scenarios), and allows to reuse existing training algorithms with no modifications. We require no assumption on the constraints, although their properties affect the shape and complexity of the master problem. Convergence guarantees are hard to provide, but we found that the approach performs well on ML tasks with fairness constraints and on classical datasets with synthetic constraints.
Injecting Domain Knowledge in Neural Networks: a Controlled Experiment on a Constrained Problem
Silvestri, Mattia, Lombardi, Michele, Milano, Michela
Given enough data, Deep Neural Networks (DNNs) are capable of learning complex input-output relations with high accuracy. In several domains, however, data is scarce or expensive to retrieve, while a substantial amount of expert knowledge is available. It seems reasonable that if we can inject this additional information in the DNN, we could ease the learning process. One such case is that of Constraint Problems, for which declarative approaches exists and pure ML solutions have obtained mixed success. Using a classical constrained problem as a case study, we perform controlled experiments to probe the impact of progressively adding domain and empirical knowledge in the DNN. Our results are very encouraging, showing that (at least in our setup) embedding domain knowledge at training time can have a considerable effect and that a small amount of empirical knowledge is sufficient to obtain practically useful results.
Learning Certified Individually Fair Representations
Ruoss, Anian, Balunović, Mislav, Fischer, Marc, Vechev, Martin
To effectively enforce fairness constraints one needs to define an appropriate notion of fairness and employ representation learning in order to impose this notion without compromising downstream utility for the data consumer. A desirable notion is individual fairness as it guarantees similar treatment for similar individuals. In this work, we introduce the first method which generalizes individual fairness to rich similarity notions via logical constraints while also enabling data consumers to obtain fairness certificates for their models. The key idea is to learn a representation that provably maps similar individuals to latent representations at most $\epsilon$ apart in $\ell_{\infty}$-distance, enabling data consumers to certify individual fairness by proving $\epsilon$-robustness of their classifier. Our experimental evaluation on six real-world datasets and a wide range of fairness constraints demonstrates that our approach is expressive enough to capture similarity notions beyond existing distance metrics while scaling to realistic use cases.
ORCSolver: An Efficient Solver for Adaptive GUI Layout with OR-Constraints
Jiang, Yue, Stuerzlinger, Wolfgang, Zwicker, Matthias, Lutteroth, Christof
OR-constrained (ORC) graphical user interface layouts unify conventional constraint-based layouts with flow layouts, which enables the definition of flexible layouts that adapt to screens with different sizes, orientations, or aspect ratios with only a single layout specification. Unfortunately, solving ORC layouts with current solvers is time-consuming and the needed time increases exponentially with the number of widgets and constraints. To address this challenge, we propose ORCSolver, a novel solving technique for adaptive ORC layouts, based on a branch-and-bound approach with heuristic preprocessing. We demonstrate that ORCSolver simplifies ORC specifications at runtime and our approach can solve ORC layout specifications efficiently at near-interactive rates.
A spatio-temporalisation of ALC(D) and its translation into alternating automata augmented with spatial constraints
The aim of this work is to provide a family of qualitative theories for spatial change in general, and for motion of spatial scenes in particular. To achieve this, we consider a spatio-temporalisation MTALC(Dx), of the well-known ALC(D) family of Description Logics (DLs) with a concrete domain: the MTALC(Dx) concepts are interpreted over infinite k-ary Sigma-trees, with the nodes standing for time points, and Sigma including, additionally to its uses in classical k-ary Sigma-trees, the description of the snapshot of an n-object spatial scene of interest; the roles split into m+n immediate-successor (accessibility) relations, which are serial, irreflexive and antisymmetric, and of which m are general, not necessarily functional, the other n functional; the concrete domain Dx is generated by an RCC8-like spatial Relation Algebra (RA) x, and is used to guide the change by imposing spatial constraints on objects of the "followed" spatial scene, eventually at different time points of the input trees. In order to capture the expressiveness of most modal temporal logics encountered in the literature, we introduce weakly cyclic Terminological Boxes (TBoxes) of MTALC(Dx), whose axioms capture the decreasing property of modal temporal operators. We show the important result that satisfiability of an MTALC(Dx) concept with respect to a weakly cyclic TBox can be reduced to the emptiness problem of a Buchi weak alternating automaton augmented with spatial constraints. In another work, complementary to this one, also submitted to this conference, we thoroughly investigate Buchi automata augmented with spatial constraints, and provide, in particular, a translation of an alternating into a nondeterministic, and an effective decision procedure for the emptiness problem of the latter.
Arc-Consistency computes the minimal binarised domains of an STP. Use of the result in a TCSP solver, in a TCSP-based job shop scheduler, and in generalising Dijkstra's one-to-all algorithm
TCSPs (Temporal Constraint Satisfaction Problems), as defined in [Dechter et al., 1991], get rid of unary constraints by binarising them after having added an "origin of the world" variable. The constraints are therefore exclusively binary; additionally, a TCSP verifies the property that it is node-consistent and arc-consistent. Path-consistency, the next higher local consistency, solves the consistency problem of a convex TCSP, referred to in [Dechter et al., 1991] as an STP (Simple Temporal Problem); more than that, the output of path-consistency applied to an n+1-variable STP is a minimal and strongly n+1-consistent STP. Weaker versions of path-consistency, aimed at avoiding what is referred to in [Schwalb and Dechter, 1997] as the "fragmentation problem", are used as filtering procedures in recursive backtracking algorithms for the consistency problem of a general TCSP. In this work, we look at the constraints between the "origin of the world" variable and the other variables, as the (binarised) domains of these other variables. With this in mind, we define a notion of arc-consistency for TCSPs, which we will refer to as binarised-domains Arc-Consistency, or bdArc-Consistency for short. We provide an algorithm achieving bdArc-Consistency for a TCSP, which we will refer to as bdAC3, for it is an adaptation of Mackworth's [1977] well-known arc-consistency algorithm AC3. We show that bdArc-Consistency computes the minimal (binarised) domains of an STP. We then show how to use the result in a general TCSP solver, in a TCSP-based job shop scheduler, and in generalising the well-known Dijkstra's one-to-all shortest paths algorithm.
Automatic Cost Function Learning with Interpretable Compositional Networks
Richoux, Florian, Baffier, Jean-François
Cost Function Networks (CFN) are a formalism in Constraint Programming to model combinatorial satisfaction or optimization problems. By associating a function to each constraint type to evaluate the quality of an assignment, it extends the expressivity of regular CSP/COP formalisms but at a price of making harder the problem modeling. Indeed, in addition to regular variables/domains/constraints sets, one must provide a set of cost functions that are not always easy to define. Here we propose a method to automatically learn a cost function of a constraint, given a function deciding if assignments are valid or not. This is to the best of our knowledge the first attempt to automatically learn cost functions. Our method aims to learn cost functions in a supervised fashion, trying to reproduce the Hamming distance, by using a variation of neural networks we named Interpretable Compositional Networks, allowing us to get explainable results, unlike regular artificial neural networks. We experiment it on 5 different constraints to show its versatility. Experiments show that functions learned on small dimensions scale on high dimensions, outputting a perfect or near-perfect Hamming distance for most constraints. Our system can be used to automatically generate cost functions and then having the expressivity of CFN with the same modeling effort than for CSP/COP.