Constraint-Based Reasoning
Balancing Constraints and Rewards with Meta-Gradient D4PG
Calian, Dan A., Mankowitz, Daniel J., Zahavy, Tom, Xu, Zhongwen, Oh, Junhyuk, Levine, Nir, Mann, Timothy
Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints. Often the constraint thresholds are incorrectly set due to the complex nature of a system or the inability to verify the thresholds offline (e.g, no simulator or reasonable offline evaluation procedure exists). This results in solutions where a task cannot be solved without violating the constraints. However, in many real-world cases, constraint violations are undesirable yet they are not catastrophic, motivating the need for soft-constrained RL approaches. We present two soft-constrained RL approaches that utilize meta-gradients to find a good trade-off between expected return and minimizing constraint violations. We demonstrate the effectiveness of these approaches by showing that they consistently outperform the baselines across four different Mujoco domains.
SYMPAIS: SYMbolic Parallel Adaptive Importance Sampling for Probabilistic Program Analysis
Luo, Yicheng, Filieri, Antonio, Zhou, Yuan
Probabilistic software analysis extends classic static analyses techniques to consider the effects of probabilistic uncertainty, whether explicitly embedded within the code - as in probabilistic program - or externalized in a probabilistic input distribution [Dwyer et al. 2016]. Similarly to their classic counterparts, these analyses aim at inferring the probability of a target event to occur during execution, e.g., reaching a program state or triggering an exception. For the probabilistic analysis of programs written in general-purpose programming languages, probabilistic symbolic execution (PSE) [Geldenhuys et al. 2012] exploits established symbolic execution engines for the language to extract constraints on probabilistic input or program variables that lead to the occurrence of the target event. The probability of satisfying any such constraints is then computed via model counting or inferred via solution space quantification methods, depending on the variable types and the characteristic of the constraints. Variations of PSE include incomplete analyses inferring probability bounds from a finite sample of program paths executed symbolically [Filieri et al. 2014], methods for non-deterministic programs [Luckow et al. 2014] and data structures [Filieri et al. 2015].
Investigating Constraint Relationship in Evolutionary Many-Constraint Optimization
Ming, Mengjun, Wang, Rui, Zhang, Tao
This paper contributes to the treatment of extensive constraints in evolutionary many-constraint optimization through consideration of the relationships between pair-wise constraints. In a conflicting relationship, the functional value of one constraint increases as the value in another constraint decreases. In a harmonious relationship, the improvement in one constraint is rewarded with simultaneous improvement in the other constraint. In an independent relationship, the adjustment to one constraint never affects the adjustment to the other. Based on the different features, methods for identifying constraint relationships are discussed, helping to simplify many-constraint optimization problems (MCOPs). Additionally, the transitivity of the relationships is further discussed at the aim of determining the relationship in a new pair of constraints.
Bias and Variance of Post-processing in Differential Privacy
Zhu, Keyu, Van Hentenryck, Pascal, Fioretto, Ferdinando
Post-processing immunity is a fundamental property of differential privacy: it enables the application of arbitrary data-independent transformations to the results of differentially private outputs without affecting their privacy guarantees. When query outputs must satisfy domain constraints, post-processing can be used to project the privacy-preserving outputs onto the feasible region. Moreover, when the feasible region is convex, a widely adopted class of post-processing steps is also guaranteed to improve accuracy. Post-processing has been applied successfully in many applications including census data-release, energy systems, and mobility. However, its effects on the noise distribution is poorly understood: It is often argued that post-processing may introduce bias and increase variance. This paper takes a first step towards understanding the properties of post-processing. It considers the release of census data and examines, both theoretically and empirically, the behavior of a widely adopted class of post-processing functions.
Projection-Based Constrained Policy Optimization
Yang, Tsung-Yen, Rosca, Justinian, Narasimhan, Karthik, Ramadge, Peter J.
We consider the problem of learning control policies that optimize a reward function while satisfying constraints due to considerations of safety, fairness, or other costs. We propose a new algorithm, Projection-Based Constrained Policy Optimization (PCPO). This is an iterative method for optimizing policies in a two-step process: the first step performs a local reward improvement update, while the second step reconciles any constraint violation by projecting the policy back onto the constraint set. We theoretically analyze PCPO and provide a lower bound on reward improvement, and an upper bound on constraint violation, for each policy update. We further characterize the convergence of PCPO based on two different metrics: $\normltwo$ norm and Kullback-Leibler divergence. Our empirical results over several control tasks demonstrate that PCPO achieves superior performance, averaging more than 3.5 times less constraint violation and around 15\% higher reward compared to state-of-the-art methods.
A New Basis for Spreadsheet Computing: Interval Solver for Microsoft Excel
There is a fundamental mismatch between the computational basis of spreadsheets and our knowledge of the real world. In spreadsheets, numeric data are represented as exact numbers and their mutual relations as functions, whose values (output) are computed from given argument values (input). However, in the real world, data are often inexact and uncertain in many ways, and the relationships, that is, constraints, between input and output are far more complicated. This article shows that interval constraint solving, an emerging AI-based technology, provides a more versatile and useful foundation for spreadsheets. The new computational basis is 100-percent downward compatible with the traditional spreadsheet paradigm.
Optimizing Limousine Service with AI
A common problem for companies with strong business growth is that it is hard to find enough experienced staff to support expansion needs. This problem is particular pronounced for operations planners and controllers who must be very highly knowledgeable and experienced with the business domain. This article is a case study of how one of the largest travel agencies in Hong Kong alleviated this problem by using AI to support decision-making and problem-solving so that their planners and controllers can work more effectively and efficiently to sustain business growth while maintaining consistent quality of service. AI is used in a mission critical fleet management system (FMS) that supports the scheduling and management of a fleet of luxury limousines for business travelers. The AI problem was modeled as a constraint satisfaction problem (CSP).
Using Global Constraints to Automate Regression Testing
Nowadays, any communicating or autonomous systems rely on high-quality software-based components. To ensure a sufficient level of quality, these components must be thoroughly verified before being released and being deployed in operational settings. Regression testing is a crucial verification process that executes any new release of a software-based component against previous versions of the component, with existing test cases. However, the selection of test cases in regression testing is challenging as the time available for testing is limited and some selection criteria must be respected. This problem, coined as Test Suite Reduction (TSR), is usually addressed by validation engineers through manual analysis or by using approximation techniques.
Plan Optimization to Bilingual Dictionary Induction for Low-Resource Language Families
Nasution, Arbi Haza, Murakami, Yohei, Ishida, Toru
Creating bilingual dictionary is the first crucial step in enriching low-resource languages. Especially for the closely-related ones, it has been shown that the constraint-based approach is useful for inducing bilingual lexicons from two bilingual dictionaries via the pivot language. However, if there are no available machine-readable dictionaries as input, we need to consider manual creation by bilingual native speakers. To reach a goal of comprehensively create multiple bilingual dictionaries, even if we already have several existing machine-readable bilingual dictionaries, it is still difficult to determine the execution order of the constraint-based approach to reducing the total cost. Plan optimization is crucial in composing the order of bilingual dictionaries creation with the consideration of the methods and their costs. We formalize the plan optimization for creating bilingual dictionaries by utilizing Markov Decision Process (MDP) with the goal to get a more accurate estimation of the most feasible optimal plan with the least total cost before fully implementing the constraint-based bilingual lexicon induction. We model a prior beta distribution of bilingual lexicon induction precision with language similarity and polysemy of the topology as $\alpha$ and $\beta$ parameters. It is further used to model cost function and state transition probability. We estimated the cost of all investment plan as a baseline for evaluating the proposed MDP-based approach with total cost as an evaluation metric. After utilizing the posterior beta distribution in the first batch of experiments to construct the prior beta distribution in the second batch of experiments, the result shows 61.5\% of cost reduction compared to the estimated all investment plan and 39.4\% of cost reduction compared to the estimated MDP optimal plan. The MDP-based proposal outperformed the baseline on the total cost.