constraint loss
An Analysis under a Unified Formulation of Learning Algorithms with Output Constraints
Neural networks (NNs) perform well on diverse tasks but sometimes produce results that are nonsensical to humans. Most NN models learn solely from (input, output) pairs, which can occasionally conflict with human knowledge. Many studies indicate that injecting human knowledge by penalizing output-constraint violations during training can improve model performance and reduce constraint violations. While there have been several attempts to compare existing algorithms under the same programming framework, no previous work has categorized learning algorithms with output constraints in a unified manner. Our contributions are as follows: (1) We categorize previous studies along three axes: the type of constraint loss used (e.g., probabilistic soft logic, REINFORCE), the exploration strategy for constraint-violating examples, and the mechanism for integrating learning signals from the main task and the constraint. (2) We propose new algorithms, inspired by continual-learning methods, to integrate the information of the main task and constraint injection. (3) We further propose the $H_\beta$-score as a metric that considers the main-task metric and constraint violation simultaneously. To provide a thorough analysis, we examine all the algorithms on three NLP tasks: natural language inference (NLI), synthetic transduction examples (STE), and semantic role labeling (SRL). We explore and reveal the key factors of the various algorithms associated with achieving high $H_\beta$-scores.
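As a rough illustration of how such a combined metric can work (the exact definition is the paper's; the sketch below assumes an F-beta-style weighted harmonic mean of the main-task score and a constraint-satisfaction rate, both in [0, 1]):

```python
def h_beta_score(task_score: float, constraint_satisfaction: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of a main-task metric and a constraint-
    satisfaction rate (e.g. 1 - violation rate), both assumed in [0, 1].
    beta > 1 weights constraint satisfaction more heavily, beta < 1 the task.
    NOTE: an illustrative F-beta-style stand-in, not the paper's exact formula.
    """
    if task_score == 0.0 and constraint_satisfaction == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * task_score * constraint_satisfaction / (b2 * task_score + constraint_satisfaction)

# Example: 85% NLI accuracy, 10% of predictions violate a logical constraint.
print(h_beta_score(0.85, 0.90, beta=1.0))  # ~0.874
```

A harmonic mean is the natural choice here because it punishes imbalance: a model that scores well on the task but violates constraints constantly cannot hide behind an arithmetic average.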
Guaranteed Conformance of Neurosymbolic Models to Natural Constraints
Sridhar, Kaustubh, Dutta, Souradeep, Weimer, James, Lee, Insup
Deep neural networks have emerged as the workhorse for a large section of robotics and control applications, especially as models for dynamical systems. Such data-driven models are in turn used for designing and verifying autonomous systems. They are particularly useful in modeling medical systems, where data can be leveraged to individualize treatment. In safety-critical applications, it is important that the data-driven model is conformant to established knowledge from the natural sciences. Such knowledge is often available, or can be distilled, into a (possibly black-box) model. For instance, an F1 racing car should conform to Newton's laws (which are encoded within a unicycle model). In this light, we consider the following problem: given a model $M$ and a state transition dataset, we wish to best approximate the system model while being a bounded distance away from $M$. We propose a method to guarantee this conformance. Our first step is to distill the dataset into a few representative samples called memories, using the idea of a growing neural gas. Next, using these memories we partition the state space into disjoint subsets and compute bounds that should be respected by the neural network in each subset. This serves as a symbolic wrapper for guaranteed conformance. We argue theoretically that this only leads to a bounded increase in approximation error, which can be controlled by increasing the number of memories. We experimentally show that on three case studies (Car Model, Drones, and Artificial Pancreas), our constrained neurosymbolic models conform to specified models (each encoding various constraints) with order-of-magnitude improvements compared to the augmented Lagrangian and vanilla training methods. Our code can be found at: https://github.com/kaustubhsridhar/Constrained_Models
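A minimal sketch of the "symbolic wrapper" idea, assuming the memories are representative states, the partition is nearest-memory (Voronoi) assignment, and per-cell output bounds `[lower, upper]` have already been derived from the reference model $M$; all names and the toy numbers are illustrative:

```python
import numpy as np

def nearest_memory(x: np.ndarray, memories: np.ndarray) -> int:
    """Partition the state space by nearest-memory (Voronoi) assignment."""
    return int(np.argmin(np.linalg.norm(memories - x, axis=1)))

def conformant_predict(x, net, memories, lower, upper):
    """Clamp the network's predicted next state into the bounds of the
    partition cell containing x, so conformance to M holds by
    construction. `net` is any learned state-transition predictor."""
    i = nearest_memory(x, memories)
    return np.clip(net(x), lower[i], upper[i])

# Toy usage: 2 memories on a 1-D state, bounds per cell, a stand-in "network".
memories = np.array([[0.0], [1.0]])
lower, upper = np.array([[-0.1], [0.8]]), np.array([[0.3], [1.2]])
net = lambda x: x + 0.5                      # stand-in dynamics model
print(conformant_predict(np.array([0.9]), net, memories, lower, upper))  # clipped to [0.8, 1.2]
```

The key property is that the guarantee lives in the wrapper, not the weights: no matter how badly the network extrapolates, its output in each cell is projected back into the bounds implied by $M$.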
Learning to Solve Constraint Satisfaction Problems with Recurrent Transformer
Yang, Zhun, Ishay, Adam, Lee, Joohyung
Constraint satisfaction problems (CSPs) are about finding values of variables that satisfy the given constraints. We show that a Transformer extended with recurrence is a viable approach to learning to solve CSPs in an end-to-end manner, with clear advantages over state-of-the-art methods such as Graph Neural Networks, SATNet, and some neuro-symbolic models. With the Transformer's ability to handle visual input, the proposed Recurrent Transformer can straightforwardly be applied to visual constraint-reasoning problems while successfully addressing the symbol grounding problem. We also show how to leverage deductive knowledge of discrete constraints in the Transformer's inductive learning to achieve sample-efficient and semi-supervised learning for CSPs.

CSPs have been widely studied in symbolic AI, with an emphasis on designing efficient algorithms that deductively find solutions for explicitly stated constraints. Recent deep learning-based approaches instead focus on inductively learning the constraints and solving them in an end-to-end manner. For example, the Recurrent Relational Network (RRN) (Palm et al., 2018) uses message passing over graph structures to learn logical constraints, achieving high accuracy on textual Sudoku. However, it relies on hand-coded information about Sudoku constraints, namely which variables are allowed to interact, and it is limited to textual input. SATNet (Wang et al., 2019) is a differentiable MAXSAT solver that can infer logical rules and can be integrated into DNNs.
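The core recurrence is simple: one weight-tied Transformer block applied repeatedly to the cell embeddings, with a prediction head read out after the final step. A minimal PyTorch sketch under those assumptions (layer sizes, step count, and the Sudoku framing are illustrative, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class RecurrentTransformer(nn.Module):
    """One weight-tied Transformer encoder layer unrolled for `steps`
    iterations over a sequence of cell embeddings (e.g. 81 Sudoku cells).
    Illustrative sketch; the paper's architecture details may differ."""
    def __init__(self, n_tokens=10, d_model=128, n_heads=4, steps=16):
        super().__init__()
        self.embed = nn.Embedding(n_tokens, d_model)   # token 0 = empty cell
        self.block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, n_tokens - 1)   # logits over digits 1..9
        self.steps = steps

    def forward(self, grid):                 # grid: (batch, 81) ints in 0..9
        h = self.embed(grid)
        for _ in range(self.steps):          # recurrence: the same block reused
            h = self.block(h)
        return self.head(h)                  # (batch, 81, 9) digit logits

model = RecurrentTransformer()
puzzle = torch.randint(0, 10, (2, 81))
print(model(puzzle).shape)                   # torch.Size([2, 81, 9])
```

Because the weights are shared across steps, extra iterations add reasoning depth without adding parameters, which is what makes the recurrent variant a natural fit for iterative constraint propagation.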
CGDTest: A Constrained Gradient Descent Algorithm for Testing Neural Networks
Nagisetty, Vineel, Graves, Laura, Pan, Guanting, Jha, Piyush, Ganesh, Vijay
In this paper, we propose a new Deep Neural Network (DNN) testing algorithm called the Constrained Gradient Descent (CGD) method, and an implementation we call CGDTest, aimed at exposing security and robustness issues such as adversarial robustness and bias in DNNs. Our CGD algorithm is a gradient-descent (GD) method, with the twist that the user can also specify logical properties that characterize the kinds of inputs that the user may want. This functionality sets CGDTest apart from other similar DNN testing tools, since it allows users to specify logical constraints to test DNNs not only for $\ell_p$ ball-based adversarial robustness but, more importantly, for richer properties such as disguised and flow adversarial constraints, as well as adversarial robustness in the NLP domain. We showcase the utility and power of CGDTest via extensive experimentation in the vision and NLP domains, comparing against 32 state-of-the-art methods. Our results indicate that CGDTest outperforms state-of-the-art testing tools for $\ell_p$ ball-based adversarial robustness, and is significantly superior in testing for the richer adversarial properties, with PAR2 score improvements of over 1500% in some cases relative to the next-best tool. Our evaluation shows that our CGD method outperforms competing methods in terms of expressibility (i.e., a rich constraint language and concomitant tool support to express a wide variety of properties), scalability (i.e., it can be applied to very large real-world models with up to 138 million parameters), and generality (i.e., it can be used to test a plethora of model architectures).
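At its core, a constrained gradient-descent test step can be viewed as ordinary input-space GD with an added differentiable penalty for violating the user's property. A hedged PyTorch sketch (the real CGDTest translates a much richer constraint language; the box constraint and penalty weight below are illustrative choices):

```python
import torch

def cgd_search(model, x0, y_true, constraint_penalty, steps=100, lr=0.01):
    """Search for an input the model misclassifies while (softly)
    satisfying a user constraint, by descending the negated task loss
    plus a penalty that is zero iff the constraint holds. Illustrative."""
    x = x0.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        task = torch.nn.functional.cross_entropy(model(x), y_true)
        loss = -task + constraint_penalty(x)   # maximize error, respect constraint
        loss.backward()
        opt.step()
    return x.detach()

# Example constraint: stay within an L_inf ball of radius 0.03 around x0.
x0 = torch.rand(1, 3, 32, 32)
penalty = lambda x: torch.clamp((x - x0).abs() - 0.03, min=0).sum() * 100.0
```

Richer properties (e.g., disguised or flow constraints) would swap in a different `constraint_penalty`; the GD loop itself stays the same.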
Property-driven Training: All You (N)Ever Wanted to Know About
Casadio, Marco, Daggitt, Matthew, Komendantskaya, Ekaterina, Kokke, Wen, Kienitz, Daniel, Stewart, Rob
Neural networks are known for their ability to detect general patterns in noisy data. This makes them a popular tool for perception components in complex AI systems. Paradoxically, they are also known for being vulnerable to adversarial attacks. In response, various methods such as adversarial training, data augmentation and Lipschitz robustness training have been proposed as means of improving their robustness. However, as this paper explores, these training methods each optimise for a different definition of robustness. We perform an in-depth comparison of these different definitions, including their relationship, assumptions, interpretability and verifiability after training. We also look at constraint-driven training, a general approach designed to encode arbitrary constraints, and show that not all of these definitions are directly encodable. Finally, we perform experiments to compare the applicability and efficacy of the training methods at ensuring the network obeys these different definitions. These results highlight that even encoding a piece of knowledge as seemingly simple as robustness into neural network training is fraught with difficult choices and pitfalls.
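For concreteness, constraint-driven training typically adds a differentiable penalty for one chosen robustness definition to the task loss. Below is a hedged sketch of one such encoding, penalising prediction change under small perturbations; the sampling-based approximation and KL choice are illustrative, not the paper's specific formulation:

```python
import torch

def robustness_penalty(model, x, eps=0.01, n_samples=4):
    """Penalize divergence between predictions on x and on random
    points in its eps-ball: a sampled, soft approximation of one
    robustness definition (verified training would bound it exactly)."""
    p = torch.log_softmax(model(x), dim=-1)
    pen = 0.0
    for _ in range(n_samples):
        noise = (torch.rand_like(x) * 2 - 1) * eps  # uniform in the eps-box
        q = torch.softmax(model(x + noise), dim=-1)
        pen = pen + torch.nn.functional.kl_div(p, q, reduction="batchmean")
    return pen / n_samples

# total loss = cross_entropy(model(x), y) + lam * robustness_penalty(model, x)
```

Swapping this penalty for, say, a worst-case (adversarial) inner maximisation or a Lipschitz bound is exactly what changes which definition of robustness is being optimised, which is the paper's central point.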
Can Domain Knowledge Alleviate Adversarial Attacks in Multi-Label Classifiers?
Melacci, Stefano, Ciravegna, Gabriele, Sotgiu, Angelo, Demontis, Ambra, Biggio, Battista, Gori, Marco, Roli, Fabio
Adversarial attacks on machine learning-based classifiers, along with defense mechanisms, have been widely studied in the context of single-label classification problems. In this paper, we shift the attention to multi-label classification, where the availability of domain knowledge on the relationships among the considered classes may offer a natural way to spot incoherent predictions, i.e., predictions associated with adversarial examples lying outside of the training data distribution. We explore this intuition in a framework in which first-order logic knowledge is converted into constraints and injected into a semi-supervised learning problem. Within this setting, the constrained classifier learns to fulfill the domain knowledge over the marginal distribution, and can naturally reject samples with incoherent predictions. Even though our method does not exploit any knowledge of attacks during training, our experimental analysis surprisingly reveals that domain-knowledge constraints can help detect adversarial examples effectively, especially if such constraints are not known to the attacker. While we also show that an adaptive attack exploiting knowledge of the constraints may still deceive our classifier, it remains an open issue to understand how hard it would be for an attacker to infer such constraints in practical cases. For this reason, we believe that our approach may provide a significant step towards designing robust multi-label classifiers.
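A minimal sketch of the detection idea: translate a first-order rule over the class marginals into a fuzzy-logic violation score and reject inputs whose predictions violate it beyond a threshold. The rule, the Łukasiewicz-style relaxation, and the threshold are all illustrative stand-ins for the paper's framework:

```python
def implication_violation(p_premise: float, p_conclusion: float) -> float:
    """Łukasiewicz-style relaxation of 'premise -> conclusion':
    zero when the conclusion's score dominates, positive otherwise."""
    return max(0.0, p_premise - p_conclusion)

def reject_incoherent(probs: dict, rules: list, tau: float = 0.2) -> bool:
    """probs: per-class marginal probabilities from a multi-label classifier.
    rules: (premise_class, conclusion_class) implications from domain knowledge.
    Returns True if the prediction is incoherent with the knowledge and
    should be rejected (e.g. as likely adversarial / off-distribution)."""
    violation = sum(implication_violation(probs[a], probs[b]) for a, b in rules)
    return violation > tau

# Incoherent prediction: high 'dog' score but low 'animal' score.
probs = {"dog": 0.9, "animal": 0.2, "cat": 0.1}
print(reject_incoherent(probs, [("dog", "animal"), ("cat", "animal")]))  # True
```

The same violation score, made differentiable, is what gets injected as a constraint loss during semi-supervised training; at test time it doubles as a coherence check.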
Elaborating on Learned Demonstrations with Temporal Logic Specifications
Innes, Craig, Ramamoorthy, Subramanian
Most current methods for learning from demonstrations assume that those demonstrations alone are sufficient to learn the underlying task. This is often untrue, especially if extra safety specifications exist which were not present in the original demonstrations. In this paper, we allow an expert to elaborate on their original demonstration with additional specification information using linear temporal logic (LTL). Our system converts LTL specifications into a differentiable loss. This loss is then used to learn a dynamic movement primitive that satisfies the underlying specification, while remaining close to the original demonstration. Further, by leveraging adversarial training, our system learns to robustly satisfy the given LTL specification on unseen inputs, not just those seen in training. We show that our method is expressive enough to work across a variety of common movement specification patterns such as obstacle avoidance, patrolling, keeping steady, and speed limitation. In addition, we show that our system can modify a base demonstration with complex specifications by incrementally composing multiple simpler specifications. We also implement our system on a PR-2 robot to show how a demonstrator can start with an initial (sub-optimal) demonstration, then interactively improve task success by including additional specifications enforced with our differentiable LTL loss.
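To make the idea concrete, here is a hedged sketch of one standard way to differentiably relax an LTL "always" specification over a trajectory, using a soft minimum as quantitative semantics (the paper's exact relaxation and dynamic-movement-primitive machinery are not reproduced; the obstacle-avoidance spec is an illustrative example):

```python
import torch

def soft_always(robustness_per_step: torch.Tensor, temperature=10.0):
    """Quantitative semantics of G(phi): the trajectory satisfies the spec
    iff min_t rho_t > 0. Softmin keeps gradients flowing to every step."""
    return -torch.logsumexp(-temperature * robustness_per_step, dim=0) / temperature

def ltl_loss(trajectory, obstacle, min_dist=0.5):
    """Positive while 'always stay min_dist away from obstacle' is
    violated anywhere along the trajectory; zero once it holds."""
    rho = torch.linalg.norm(trajectory - obstacle, dim=1) - min_dist  # per-step margin
    return torch.relu(-soft_always(rho))

traj = torch.tensor([[0.0, 0.0], [0.4, 0.1], [1.0, 1.0]], requires_grad=True)
loss = ltl_loss(traj, obstacle=torch.tensor([0.5, 0.0]))
loss.backward()   # gradients push the offending waypoint away from the obstacle
print(loss.item())
```

Temporal operators compose: "eventually" swaps the softmin for a softmax over time, which is what lets multiple simple specifications be combined incrementally into a complex one.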
Escaping from Collapsing Modes in a Constrained Space
Chang, Chia-Che, Lin, Chieh Hubert, Lee, Che-Rung, Juan, Da-Cheng, Wei, Wei, Chen, Hwann-Tzong
Generative adversarial networks (GANs) often suffer from unpredictable mode collapse during training. We study the issue of mode collapse in the Boundary Equilibrium Generative Adversarial Network (BEGAN), one of the state-of-the-art generative models. Despite its potential for generating high-quality images, we find that BEGAN tends to collapse at some modes after a period of training. We propose a new model, called \emph{BEGAN with a Constrained Space} (BEGAN-CS), which includes a latent-space constraint in the loss function. We show that BEGAN-CS can significantly improve training stability and suppress mode collapse without either increasing the model complexity or degrading the image quality. Further, we visualize the distribution of latent vectors to elucidate the effect of the latent-space constraint. The experimental results show that our method has the additional advantages of being able to train on small datasets and to generate images similar to a given real image yet with variations of designated attributes on-the-fly.
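The latent-space constraint itself is compact: encoding a generated image with the discriminator's autoencoder should recover, approximately, the latent code that produced it. A hedged sketch of that single added loss term (the network interfaces and the L1 distance are an illustrative reading of the constrained-space idea, not the paper's verbatim formulation):

```python
import torch

def latent_constraint_loss(z, generator, encoder):
    """Constrain the generator so that encoding G(z) recovers z,
    anchoring each code to a distinct output and discouraging many
    z's from collapsing onto the same image. Illustrative sketch."""
    z_hat = encoder(generator(z))
    return torch.mean(torch.abs(z - z_hat))      # L1 distance in latent space

# generator objective: began_generator_loss + lambda_cs * latent_constraint_loss(z, G, Enc)
```

Intuitively, if two distant codes mapped to the same collapsed image, their reconstructions would coincide and the constraint term could not be small for both, which is what penalises collapse.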
Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints
Cotter, Andrew, Gupta, Maya, Jiang, Heinrich, Srebro, Nathan, Sridharan, Karthik, Wang, Serena, Woodworth, Blake, You, Seungil
Classifiers can be trained with data-dependent constraints to satisfy fairness goals, reduce churn, achieve a targeted false-positive rate, or meet other policy goals. We study the generalization performance of such constrained optimization problems, in terms of how well the constraints are satisfied at evaluation time, given that they are satisfied at training time. To improve generalization performance, we frame the problem as a two-player game in which one player optimizes the model parameters on a training dataset, and the other player enforces the constraints on an independent validation dataset. We build on recent work in two-player constrained optimization to show that this two-dataset approach can significantly improve constraint generalization. As we illustrate experimentally, the approach works not only in theory, but also in practice.
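A hedged sketch of the two-dataset game: the model player does gradient descent on a Lagrangian built from the training set, while the multiplier player does projected gradient ascent on constraint violations measured on a held-out validation set. Batching, the differentiable `violation` proxy, and the step sizes are all simplified placeholders:

```python
import torch

def two_dataset_constrained_train(model, train, val, violation, task_loss,
                                  steps=1000, lr=1e-2, lr_lam=1e-1):
    """theta-player: minimize task loss + lam * violation on TRAIN batches.
    lam-player:   maximize lam * violation measured on VAL batches, so the
    constraint must generalize rather than merely hold on training data.
    `train`/`val` are callables yielding (x, y) batches; `violation` is a
    differentiable surrogate for the rate constraint (all illustrative)."""
    lam = torch.zeros(1)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        x, y = train()
        opt.zero_grad()
        (task_loss(model, x, y) + lam.item() * violation(model, x, y)).backward()
        opt.step()
        xv, yv = val()
        with torch.no_grad():                # projected ascent on the multiplier
            lam = torch.clamp(lam + lr_lam * violation(model, xv, yv), min=0.0)
    return model, lam
```

The asymmetry is the point: because the multiplier only ever sees validation data, overfitting the constraint on the training set no longer buys the model player anything.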