Constraint-Based Reasoning
Reviews: Constraints Based Convex Belief Propagation
General comments: (i) The authors only solve a new special kind of higher order consistency constraints, generalizing soft PN-potentials, but not a truly general class of constraints, as indicated in the title or in the abstract. In case of MAP-inference, which is normally desired, the goal is to obtain a single assignment which satisfies all given linear constraints. The relaxed model the authors optimize is simply a byproduct of looking for marginals instead of MAP-assignments (the added entropy is responsible for this). In case of vanishing entropy one gets the same model. Hence there certainly remains the disadvantage of a parameter in the PN-potential, but now hidden in the entropy.
Perceptual Kalman Filters: Online State Estimation under a Perfect Perceptual-Quality Constraint
Many practical settings call for the reconstruction of temporal signals from corrupted or missing data. Classic examples include decoding, tracking, signal enhancement and denoising. Since the reconstructed signals are ultimately viewed by humans, it is desirable to achieve reconstructions that are pleasing to human perception.Mathematically, perfect perceptual-quality is achieved when the distribution of restored signals is the same as that of natural signals, a requirement which has been heavily researched in static estimation settings (i.e. when a whole signal is processed at once). Here, we study the problem of optimal causal filtering under a perfect perceptual-quality constraint, which is a task of fundamentally different nature. Specifically, we analyze a Gaussian Markov signal observed through a linear noisy transformation.
Online Learning under Adversarial Nonlinear Constraints
In many applications, learning systems are required to process continuous non-stationary data streams.We study this problem in an online learning framework and propose an algorithm that can deal with adversarial time-varying and nonlinear constraints.As we show in our work, the algorithm called Constraint Violation Velocity Projection (CVV-Pro) achieves \sqrt{T} regret and converges to the feasible set at a rate of 1/\sqrt{T}, despite the fact that the feasible set is slowly time-varying and a priori unknown to the learner. CVV-Pro only relies on local sparse linear approximations of the feasible set and therefore avoids optimizing over the entire set at each iteration, which is in sharp contrast to projected gradients or Frank-Wolfe methods. We also empirically evaluate our algorithm on two-player games, where the players are subjected to a shared constraint.
Reduced Policy Optimization for Continuous Control with Hard Constraints
Recent advances in constrained reinforcement learning (RL) have endowed reinforcement learning with certain safety guarantees. However, deploying existing constrained RL algorithms in continuous control tasks with general hard constraints remains challenging, particularly in those situations with non-convex hard constraints. Inspired by the generalized reduced gradient (GRG) algorithm, a classical constrained optimization technique, we propose a reduced policy optimization (RPO) algorithm that combines RL with GRG to address general hard constraints. Subsequently, RPO calculates the nonbasic actions by solving equations based on equality constraints using the obtained basic actions. The policy network is then updated by implicitly differentiating nonbasic actions with respect to basic actions.
Online Convex Optimization with Hard Constraints: Towards the Best of Two Worlds and Beyond
This paper considers online convex optimization with hard constraints and analyzes achievable regret and cumulative hard constraint violation (violation for short). The problem distinguishes itself from online convex optimization with soft constraints, where a violation at one round can be compensated/cancelled by a conservative decision at a different round. We propose a RECtified Online Optimization algorithm (RECOO) and consider two settings: fixed constraints and adversarial constraints. Both settings have been considered in the literature. Compared with existing results, {\em RECOO achieves the best of two worlds and beyond.}
An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints
This paper considers stochastic linear bandits with general nonlinear constraints. The objective is to maximize the expected cumulative reward over horizon T subject to a set of constraints in each round \tau\leq T . We propose a pessimistic-optimistic algorithm for this problem, which is efficient in two aspects. First, the algorithm yields \tilde{\cal O}\left(\left(\frac{K {0.75}}{\delta} d\right)\sqrt{\tau}\right) (pseudo) regret in round \tau\leq T, where K is the number of constraints, d is the dimension of the reward feature space, and \delta is a Slater's constant; and {\em zero} constraint violation in any round \tau \tau', where \tau' is {\em independent} of horizon T. Second, the algorithm is computationally efficient. Our algorithm is based on the primal-dual approach in optimization and includes two components.
Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints
Querying knowledge graphs (KGs) using deep learning approaches can naturally leverage the reasoning and generalization ability to learn to infer better answers. Traditional neural complex query answering (CQA) approaches mostly work on entity-centric KGs. However, in the real world, we also need to make logical inferences about events, states, and activities (i.e., eventualities or situations) to push learning systems from System I to System II, as proposed by Yoshua Bengio. Querying logically from an EVentuality-centric KG (EVKG) can naturally provide references to such kind of intuitive and logical inference. Thus, in this paper, we propose a new framework to leverage neural methods to answer complex logical queries based on an EVKG, which can satisfy not only traditional first-order logic constraints but also implicit logical constraints over eventualities concerning their occurrences and orders. For instance, if we know that Food is bad happens before PersonX adds soy sauce, then PersonX adds soy sauce is unlikely to be the cause of Food is bad due to implicit temporal constraint.
A Surrogate Objective Framework for Prediction+Programming with Soft Constraints
Prediction optimization is a common real-world paradigm where we have to predict problem parameters before solving the optimization problem. However, the criteria by which the prediction model is trained are often inconsistent with the goal of the downstream optimization problem. Recently, decision-focused prediction approaches, such as SPO and direct optimization, have been proposed to fill this gap. However, they cannot directly handle the soft constraints with the max operator required in many real-world objectives. This paper proposes a novel analytically differentiable surrogate objective framework for real-world linear and semi-definite negative quadratic programming problems with soft linear and non-negative hard constraints. This framework gives the theoretical bounds on constraints' multipliers, and derives the closed-form solution with respect to predictive parameters and thus gradients for any variable in the problem.
Semantic Probabilistic Layers for Neuro-Symbolic Learning
We design a predictive layer for structured-output prediction (SOP) that can be plugged into any neural network guaranteeing its predictions are consistent with a set of predefined symbolic constraints. Our Semantic Probabilistic Layer (SPL) can model intricate correlations, and hard constraints, over a structured output space all while being amenable to end-to-end learning via maximum likelihood.SPLs combine exact probabilistic inference with logical reasoning in a clean and modular way, learning complex distributions and restricting their support to solutions of the constraint. We empirically demonstrate that SPLs outperform these competitors in terms of accuracy on challenging SOP tasks such as hierarchical multi-label classification, pathfinding and preference learning, while retaining perfect constraint satisfaction.
Deep Attentive Belief Propagation: Integrating Reasoning and Learning for Solving Constraint Optimization Problems
Belief Propagation (BP) is an important message-passing algorithm for various reasoning tasks over graphical models, including solving the Constraint Optimization Problems (COPs). It has been shown that BP can achieve state-of-the-art performance on various benchmarks by mixing old and new messages before sending the new one, i.e., damping. However, existing methods on tuning a static damping factor for BP not only is laborious but also harms their performance. Moreover, existing BP algorithms treat each variable node's neighbors equally when composing a new message, which also limits their exploration ability. To address these issues, we seamlessly integrate BP, Gated Recurrent Units (GRUs), and Graph Attention Networks (GATs) within the massage-passing framework to reason about dynamic weights and damping factors for composing new BP messages. Our model, Deep Attentive Belief Propagation (DABP), takes the factor graph and the BP messages in each iteration as the input and infers the optimal weights and damping factors through GRUs and GATs, followed by a multi-head attention layer.