logic constraint
NeuralRule-ExecutionTrackingMachineFor Transformer-BasedTextGeneration
Sequence-to-Sequence (Seq2Seq) neural text generation models, especially the pre-trained ones (e.g., BART and T5), have exhibited compelling performance on various natural language generation tasks. However,the black-box nature of these models limits their application in tasks where specific rules (e.g., controllable constraints, prior knowledge) need to be executed.
ILCL: Inverse Logic-Constraint Learning from Temporally Constrained Demonstrations
Cho, Minwoo, Jang, Jaehwi, Park, Daehyung
We aim to solve the problem of temporal-constraint learning from demonstrations to reproduce demonstration-like logic-constrained behaviors. Learning logic constraints is challenging due to the combinatorially large space of possible specifications and the ill-posed nature of non-Markovian constraints. To figure it out, we introduce a novel temporal-constraint learning method, which we call inverse logic-constraint learning (ILCL). Our method frames ICL as a two-player zero-sum game between 1) a genetic algorithm-based temporal-logic mining (GA-TL-Mining) and 2) logic-constrained reinforcement learning (Logic-CRL). GA-TL-Mining efficiently constructs syntax trees for parameterized truncated linear temporal logic (TLTL) without predefined templates. Subsequently, Logic-CRL finds a policy that maximizes task rewards under the constructed TLTL constraints via a novel constraint redistribution scheme. Our evaluations show ILCL outperforms state-of-the-art baselines in learning and transferring TL constraints on four temporally constrained tasks. We also demonstrate successful transfer to real-world peg-in-shallow-hole tasks.
Semantic Objective Functions: A distribution-aware method for adding logical constraints in deep learning
Mendez-Lucero, Miguel Angel, Gallardo, Enrique Bojorquez, Belle, Vaishak
Issues of safety, explainability, and efficiency are of increasing concern in learning systems deployed with hard and soft constraints. Symbolic Constrained Learning and Knowledge Distillation techniques have shown promising results in this area, by embedding and extracting knowledge, as well as providing logical constraints during neural network training. Although many frameworks exist to date, through an integration of logic and information geometry, we provide a construction and theoretical framework for these tasks that generalize many approaches. We propose a loss-based method that embeds knowledge--enforces logical constraints--into a machine learning model that outputs probability distributions. This is done by constructing a distribution from the external knowledge/logic formula, and constructing a loss function as a linear combination of the original loss function with the Fisher-Rao distance or Kullback-Leibler divergence to the constraint distribution. This construction includes logical constraints in the form of propositional formulas (Boolean variables), formulas of a first-order language with finite variables over a model with compact domain (categorical and continuous variables), and in general,likely applicable to any statistical model that was pretrained with semantic information. We evaluate our method on a variety of learning tasks, including classification tasks with logic constraints, transferring knowledge from logic formulas, and knowledge distillation from general distributions.
Multitask Kernel-based Learning with Logic Constraints
Diligenti, Michelangelo, Gori, Marco, Maggini, Marco, Rigutini, Leonardo
This paper presents a general framework to integrate prior knowledge in the form of logic constraints among a set of task functions into kernel machines. The logic propositions provide a partial representation of the environment, in which the learner operates, that is exploited by the learning algorithm together with the information available in the supervised examples. In particular, we consider a multi-task learning scheme, where multiple unary predicates on the feature space are to be learned by kernel machines and a higher level abstract representation consists of logic clauses on these predicates, known to hold for any input. A general approach is presented to convert the logic clauses into a continuous implementation, that processes the outputs computed by the kernel-based predicates. The learning task is formulated as a primal optimization problem of a loss function that combines a term measuring the fitting of the supervised examples, a regularization term, and a penalty term that enforces the constraints on both supervised and unsupervised examples. The proposed semi-supervised learning framework is particularly suited for learning in high dimensionality feature spaces, where the supervised training examples tend to be sparse and generalization difficult. Unlike for standard kernel machines, the cost function to optimize is not generally guaranteed to be convex. However, the experimental results show that it is still possible to find good solutions using a two stage learning schema, in which first the supervised examples are learned until convergence and then the logic constraints are forced. Some promising experimental results on artificial multi-task learning tasks are reported, showing how the classification accuracy can be effectively improved by exploiting the a priori rules and the unsupervised examples.
Logic Constraints to Feature Importances
Picchiotti, Nicola, Gori, Marco
In recent years, Artificial Intelligence (AI) algorithms have been proven to outperform traditional statistical methods in terms of predictivity, especially when a large amount of data was available. Nevertheless, the "black box" nature of AI models is often a limit for a reliable application in high-stakes fields like diagnostic techniques, autonomous guide, etc. Recent works have shown that an adequate level of interpretability could enforce the more general concept of model trustworthiness. The basic idea of this paper is to exploit the human prior knowledge of the features' importance for a specific task, in order to coherently aid the phase of the model's fitting. This sort of "weighted" AI is obtained by extending the empirical loss with a regularization term encouraging the importance of the features to follow predetermined constraints. This procedure relies on local methods for the feature importance computation, e.g. LRP, LIME, etc. that are the link between the model weights to be optimized and the user-defined constraints on feature importance. In the fairness area, promising experimental results have been obtained for the Adult dataset. Many other possible applications of this model agnostic theoretical framework are described.
Asynchronous Distributed Learning from Constraints
Farina, Francesco, Melacci, Stefano, Garulli, Andrea, Giannitrapani, Antonio
In this paper, the extension of the framework of Learning from Constraints (LfC) to a distributed setting where multiple parties, connected over the network, contribute to the learning process is studied. LfC relies on the generic notion of "constraint" to inject knowledge into the learning problem and, due to its generality, it deals with possibly nonconvex constraints, enforced either in a hard or soft way. Motivated by recent progresses in the field of distributed and constrained nonconvex optimization, we apply the (distributed) Asynchronous Method of Multipliers (ASYMM) to LfC. The study shows that such a method allows us to support scenarios where selected constraints (i.e., knowledge), data, and outcomes of the learning process can be locally stored in each computational node without being shared with the rest of the network, opening the road to further investigations into privacy-preserving LfC. Constraints act as a bridge between what is shared over the net and what is private to each node and no central authority is required. We demonstrate the applicability of these ideas in two distributed real-world settings in the context of digit recognition and document classification.
Trusted Neural Networks for Safety-Constrained Autonomous Control
Ghosh, Shalini, Mercier, Amaury, Pichapati, Dheeraj, Jha, Susmit, Yegneswaran, Vinod, Lincoln, Patrick
We propose Trusted Neural Network (TNN) models, which are deep neural network models that satisfy safety constraints critical to the application domain. We investigate different mechanisms for incorporating rule-based knowledge in the form of first-order logic constraints into a TNN model, where rules that encode safety are accompanied by weights indicating their relative importance. This framework allows the TNN model to learn from knowledge available in form of data as well as logical rules. We propose multiple approaches for solving this problem: (a) a multi-headed model structure that allows trade-off between satisfying logical constraints and fitting training data in a unified training framework, and (b) creating a constrained optimization problem and solving it in dual formulation by posing a new constrained loss function and using a proximal gradient descent algorithm. We demonstrate the efficacy of our TNN framework through experiments using the open-source TORCS~\cite{BernhardCAA15} 3D simulator for self-driving cars. Experiments using our first approach of a multi-headed TNN model, on a dataset generated by a customized version of TORCS, show that (1) adding safety constraints to a neural network model results in increased performance and safety, and (2) the improvement increases with increasing importance of the safety constraints. Experiments were also performed using the second approach of proximal algorithm for constrained optimization --- they demonstrate how the proposed method ensures that (1) the overall TNN model satisfies the constraints even when the training data violates some of the constraints, and (2) the proximal gradient descent algorithm on the constrained objective converges faster than the unconstrained version.
Joint Structured Learning and Predictions under Logical Constraints in Conditional Random Fields
This paper is concerned with structured machine learning, in a supervised machine learning context. It discusses how to make joint structured learning on interdependent objects of different nature, as well as how to enforce logical constraints when predicting labels. We explain how this need arose in a Document Understanding task. We then discuss a general extension to Conditional Random Fields (CRF) for this purpose and present the contributed open source implementation on top of the open source PyStruct library. We evaluate its performance on a publicly available dataset. Keywords: supervised machine learning, structured prediction, conditional random fields.