Constraint-Based Reasoning
Grid-SiPhyR: An end-to-end learning to optimize framework for combinatorial problems in power systems
Haider, Rabab, Annaswamy, Anuradha M.
Mixed integer problems are ubiquitous in decision making, from discrete device settings and design parameters, unit production, and on/off or yes/no decision in switches, routing, and social networks. Despite their prevalence, classical optimization approaches for combinatorial optimization remain prohibitively slow for fast and accurate decision making in dynamic and safety-critical environments with hard constraints. To address this gap, we propose SiPhyR (pronounced: cipher), a physics-informed machine learning framework for end-to-end learning to optimize for combinatorial problems. SiPhyR employs a novel physics-informed rounding approach to tackle the challenge of combinatorial optimization within a differentiable framework that has certified satisfiability of safety-critical constraints. We demonstrate the effectiveness of SiPhyR on an emerging paradigm for clean energy systems: dynamic reconfiguration, where the topology of the electric grid and power flow are optimized so as to maintain a safe and reliable power grid in the presence of intermittent renewable generation. Offline training of the unsupervised framework on representative load and generation data makes dynamic decision making via the online application of Grid-SiPhyR computationally feasible.
Constrained Proximal Policy Optimization
Xuan, Chengbin, Zhang, Feng, Yin, Faliang, Lam, Hak-Keung
The problem of constrained reinforcement learning (CRL) holds significant importance as it provides a framework for addressing critical safety satisfaction concerns in the field of reinforcement learning (RL). However, with the introduction of constraint satisfaction, the current CRL methods necessitate the utilization of second-order optimization or primal-dual frameworks with additional Lagrangian multipliers, resulting in increased complexity and inefficiency during implementation. To address these issues, we propose a novel first-order feasible method named Constrained Proximal Policy Optimization (CPPO). By treating the CRL problem as a probabilistic inference problem, our approach integrates the Expectation-Maximization framework to solve it through two steps: 1) calculating the optimal policy distribution within the feasible region (E-step), and 2) conducting a first-order update to adjust the current policy towards the optimal policy obtained in the E-step (M-step). We establish the relationship between the probability ratios and KL divergence to convert the E-step into a convex optimization problem. Furthermore, we develop an iterative heuristic algorithm from a geometric perspective to solve this problem. Additionally, we introduce a conservative update mechanism to overcome the constraint violation issue that occurs in the existing feasible region method. Empirical evaluations conducted in complex and uncertain environments validate the effectiveness of our proposed method, as it performs at least as well as other baselines.
Learning Feasibility of Factored Nonlinear Programs in Robotic Manipulation Planning
Ortiz-Haro, Joaquim, Ha, Jung-Su, Driess, Danny, Karpas, Erez, Toussaint, Marc
A factored Nonlinear Program (Factored-NLP) explicitly models the dependencies between a set of continuous variables and nonlinear constraints, providing an expressive formulation for relevant robotics problems such as manipulation planning or simultaneous localization and mapping. When the problem is over-constrained or infeasible, a fundamental issue is to detect a minimal subset of variables and constraints that are infeasible. Previous approaches require solving several nonlinear programs, incrementally adding and removing constraints, and are thus computationally expensive. In this paper, we propose a graph neural architecture that predicts which variables and constraints are jointly infeasible. The model is trained with a dataset of labeled subgraphs of Factored-NLPs, and importantly, can make useful predictions on larger factored nonlinear programs than the ones seen during training. We evaluate our approach in robotic manipulation planning, where our model is able to generalize to longer manipulation sequences involving more objects and robots, and different geometric environments. The experiments show that the learned model accelerates general algorithms for conflict extraction (by a factor of 50) and heuristic algorithms that exploit expert knowledge (by a factor of 4).
Multi-Robot Coordination and Cooperation with Task Precedence Relationships
Gosrich, Walker, Mayya, Siddharth, Narayan, Saaketh, Malencia, Matthew, Agarwal, Saurav, Kumar, Vijay
We propose a new formulation for the multi-robot task planning and allocation problem that incorporates (a) precedence relationships between tasks; (b) coordination for tasks allowing multiple robots to achieve increased efficiency; and (c) cooperation through the formation of robot coalitions for tasks that cannot be performed by individual robots alone. In our formulation, the tasks and the relationships between the tasks are specified by a task graph. We define a set of reward functions over the task graph's nodes and edges. These functions model the effect of robot coalition size on the task performance, and incorporate the influence of one task's performance on a dependent task. Solving this problem optimally is NP-hard. However, using the task graph formulation allows us to leverage min-cost network flow approaches to obtain approximate solutions efficiently. Additionally, we explore a mixed integer programming approach, which gives optimal solutions for small instances of the problem but is computationally expensive. We also develop a greedy heuristic algorithm as a baseline. Our modeling and solution approaches result in task plans that leverage task precedence relationships and robot coordination and cooperation to achieve high mission performance, even in large missions with many agents.
Approximating a RUM from Distributions on k-Slates
Chierichetti, Flavio, Giacchini, Mirko, Kumar, Ravi, Panconesi, Alessandro, Tomkins, Andrew
In this work we consider the problem of fitting Random Utility Models (RUMs) to user choices. Given the winner distributions of the subsets of size $k$ of a universe, we obtain a polynomial-time algorithm that finds the RUM that best approximates the given distribution on average. Our algorithm is based on a linear program that we solve using the ellipsoid method. Given that its corresponding separation oracle problem is NP-hard, we devise an approximate separation oracle that can be viewed as a generalization of the weighted feedback arc set problem to hypergraphs. Our theoretical result can also be made practical: we obtain a heuristic that is effective and scales to real-world datasets.
Interactive Model Expansion in an Observable Environment
Carbonnelle, Pierre, Vennekens, Joost, Bogaerts, Bart, Denecker, Marc
Many practical problems can be understood as the search for a state of affairs that extends a fixed partial state of affairs, the \emph{environment}, while satisfying certain conditions that are formally specified. Such problems are found in, e.g., engineering, law or economics. We study this class of problems in a context where some of the relevant information about the environment is not known by the user at the start of the search. During the search, the user may consider tentative solutions that make implicit hypotheses about these unknowns. To ensure that the solution is appropriate, these hypotheses must be verified by observing the environment. Furthermore, we assume that, in addition to knowledge of what constitutes a solution, knowledge of general laws of the environment is also present. We formally define partial solutions with enough verified facts to guarantee the existence of complete and appropriate solutions. Additionally, we propose an interactive system to assist the user in their search by determining 1) which hypotheses implicit in a tentative solution must be verified in the environment, and 2) which observations can bring useful information for the search. We present an efficient method to over-approximate the set of relevant information, and evaluate our implementation.
BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases
Liu, Xin, Khalifa, Muhammad, Wang, Lu
Energy-based models (EBMs) have gained popularity for controlled text generation due to their high applicability to a wide range of constraints. However, sampling from EBMs is non-trivial, as it often requires a large number of iterations to converge to plausible text, which slows down the decoding process and makes it less practical for real-world applications. In this work, we propose BOLT, which relies on tunable biases to directly adjust the language model's output logits. Unlike prior work, BOLT maintains the generator's autoregressive nature to assert a strong control on token-wise conditional dependencies and overall fluency, and thus converges faster. When compared with state-of-the-arts on controlled generation tasks using both soft constraints (e.g., sentiment control) and hard constraints (e.g., keyword-guided topic control), BOLT demonstrates significantly improved efficiency and fluency. On sentiment control, BOLT is 7x faster than competitive baselines, and more fluent in 74.4% of the evaluation samples according to human judges.
Counterfactuals for Design: A Model-Agnostic Method For Design Recommendations
Regenwetter, Lyle, Obaideh, Yazan Abu, Ahmed, Faez
We introduce Multi-Objective Counterfactuals for Design (MCD), a novel method for counterfactual optimization in design problems. Counterfactuals are hypothetical situations that can lead to a different decision or choice. In this paper, the authors frame the counterfactual search problem as a design recommendation tool that can help identify modifications to a design, leading to better functional performance. MCD improves upon existing counterfactual search methods by supporting multi-objective queries, which are crucial in design problems, and by decoupling the counterfactual search and sampling processes, thus enhancing efficiency and facilitating objective tradeoff visualization. The paper demonstrates MCD's core functionality using a two-dimensional test case, followed by three case studies of bicycle design that showcase MCD's effectiveness in real-world design problems. In the first case study, MCD excels at recommending modifications to query designs that can significantly enhance functional performance, such as weight savings and improvements to the structural safety factor. The second case study demonstrates that MCD can work with a pre-trained language model to suggest design changes based on a subjective text prompt effectively. Lastly, the authors task MCD with increasing a query design's similarity to a target image and text prompt while simultaneously reducing weight and improving structural performance, demonstrating MCD's performance on a complex multimodal query. Overall, MCD has the potential to provide valuable recommendations for practitioners and design automation researchers looking for answers to their ``What if'' questions by exploring hypothetical design modifications and their impact on multiple design objectives. The code, test problems, and datasets used in the paper are available to the public at decode.mit.edu/projects/counterfactuals/.
Constrained Environment Optimization for Prioritized Multi-Agent Navigation
Traditional approaches to the design of multi-agent navigation algorithms consider the environment as a fixed constraint, despite the influence of spatial constraints on agents' performance. Yet hand-designing conducive environment layouts is inefficient and potentially expensive. The goal of this paper is to consider the environment as a decision variable in a system-level optimization problem, where both agent performance and environment cost are incorporated. Towards this end, we propose novel problems of unprioritized and prioritized environment optimization, where the former considers agents unbiasedly and the latter accounts for agent priorities. We show, through formal proofs, under which conditions the environment can change while guaranteeing completeness (i.e., all agents reach goals), and analyze the role of agent priorities in the environment optimization. We proceed to impose real-world constraints on the environment optimization and formulate it mathematically as a constrained stochastic optimization problem. Since the relation between agents, environment and performance is challenging to model, we leverage reinforcement learning to develop a model-free solution and a primal-dual mechanism to handle constraints. Distinct information processing architectures are integrated for various implementation scenarios, including online/offline optimization and discrete/continuous environment. Numerical results corroborate the theory and demonstrate the validity and adaptability of our approach.
Solving Quantum-Inspired Perfect Matching Problems via Tutte's Theorem-Based Hybrid Boolean Constraints
Vardi, Moshe Y., Zhang, Zhiwei
Determining the satisfiability of Boolean constraint-satisfaction problems with different types of constraints, that is hybrid constraints, is a well-studied problem with important applications. We study here a new application of hybrid Boolean constraints, which arises in quantum computing. The problem relates to constrained perfect matching in edge-colored graphs. While general-purpose hybrid constraint solvers can be powerful, we show that direct encodings of the constrained-matching problem as hybrid constraints scale poorly and special techniques are still needed. We propose a novel encoding based on Tutte's Theorem in graph theory as well as optimization techniques. Empirical results demonstrate that our encoding, in suitable languages with advanced SAT solvers, scales significantly better than a number of competing approaches on constrained-matching benchmarks. Our study identifies the necessity of designing problem-specific encodings when applying powerful general-purpose constraint solvers.