Hoang, Khoi D. | Fioretto, Ferdinando (Syracuse University) | Hou, Ping (Uber Advanced Technologies Group) | Yeoh, William (Washington University in St. Louis) | Yokoo, Makoto (Kyushu University) | Zivan, Roie (Ben-Gurion University of the Negev)
The Distributed Constraint Optimization Problem (DCOP) formulation is a powerful tool for modeling multi-agent coordination problems. To solve DCOPs in a dynamic environment, Dynamic DCOPs (D-DCOPs) have been proposed to model the inherent dynamism present in many coordination problems. D-DCOPs solve a sequence of static problems by reacting to changes in the environment as the agents observe them. Such reactive approaches ignore knowledge about future changes of the problem. To overcome this limitation, we introduce Proactive Dynamic DCOPs (PD-DCOPs), a novel formalism to model D-DCOPs in the presence of exogenous uncertainty. In contrast to reactive approaches, PD-DCOPs are able to explicitly model possible changes of the problem and take such information into account when solving the dynamically changing problem in a proactive manner. The additional expressivity of this formalism allows it to model a wider variety of distributed optimization problems. Our work presents both theoretical and practical contributions that advance current dynamic DCOP models: (i) We introduce Proactive Dynamic DCOPs (PD-DCOPs), which explicitly model how the DCOP will change over time; (ii) We develop exact and heuristic algorithms to solve PD-DCOPs in a proactive manner; (iii) We provide theoretical results about the complexity of this new class of DCOPs; and (iv) We empirically evaluate both proactive and reactive algorithms to determine the trade-offs between the two classes. The final contribution is important as our results are the first that identify the characteristics of the problems that the two classes of algorithms excel in.
A computer-aided process planning system should ideally generate and optimize process plans to ensure the application of good manufacturing practices and maintain the consistency of the desired functional specifications of a part during its production processes. Crucial processes, such as selecting machining resources, determining set-up plans and sequencing operations of a part should be considered simultaneously to achieve global optimal solutions. In this paper, these processes are integrated and modelled as a constraint-based optimization problem, and a tabu search-based approach is proposed to solve it effectively. In the optimization model, costs of the utilized machines and cutting tools, machine changes, tool changes, set-ups and departure from good manufacturing practices (penalty function) are the optimization evaluation criteria. Precedence constraints from the geometric and manufacturing interactions between features and their related operations in a part are defined and classified according to their effects on the plan feasibility and processing quality.
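As a minimal sketch of the tabu search idea described above, the following Python example sequences four hypothetical operations under precedence constraints and counts tool changes as the only cost. The operation names, the adjacent-swap neighbourhood, and the tabu tenure are illustrative assumptions, not the paper's model, which also weighs machine costs, set-ups and penalty terms.

```python
from collections import deque

def tool_changes(sequence, tool_of):
    """Count adjacent operation pairs that need different tools."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if tool_of[a] != tool_of[b])

def is_feasible(sequence, precedences):
    """Check that every (before, after) precedence pair is respected."""
    position = {op: i for i, op in enumerate(sequence)}
    return all(position[before] < position[after] for before, after in precedences)

def tabu_search(start, tool_of, precedences, iterations=50, tenure=3):
    current, best = list(start), list(start)
    tabu = deque(maxlen=tenure)  # recently swapped operation pairs
    for _ in range(iterations):
        neighbors = []
        for i in range(len(current) - 1):
            move = frozenset((current[i], current[i + 1]))
            if move in tabu:
                continue  # no aspiration criterion in this sketch
            candidate = current[:]
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            if is_feasible(candidate, precedences):
                neighbors.append((tool_changes(candidate, tool_of), move, candidate))
        if not neighbors:
            break
        # Move to the best non-tabu neighbour, even if it is worse than current.
        cost, move, current = min(neighbors, key=lambda n: n[0])
        tabu.append(move)
        if cost < tool_changes(best, tool_of):
            best = current[:]
    return best

# Hypothetical part: two drilling and two milling operations.
tool_of = {"drill_a": "drill", "mill_a": "mill", "drill_b": "drill", "mill_b": "mill"}
precedences = [("drill_a", "mill_a"), ("drill_b", "mill_b")]
start = ["drill_a", "mill_a", "drill_b", "mill_b"]
best = tabu_search(start, tool_of, precedences)
print(best, tool_changes(best, tool_of))
```

Accepting the best non-tabu neighbour even when it worsens the cost is what lets the search leave local optima; the tabu list only prevents it from immediately undoing recent swaps.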
It is well-documented how artificial intelligence can have (and already is having) a big impact on chemical engineering. But classical machine learning approaches may be weak for many chemical engineering applications. This review discusses how challenging data characteristics arise in chemical engineering applications. We identify four characteristics of data arising in chemical engineering applications that make applying classical artificial intelligence approaches difficult: (1) high variance, low volume data, (2) low variance, high volume data, (3) noisy/corrupt/missing data, and (4) restricted data with physics-based limitations. For each of these four data characteristics, we discuss applications where these data characteristics arise and show how current chemical engineering research is extending the fields of data science and machine learning to incorporate these challenges. Finally, we identify several challenges for future research.
The notion of reparametrizations of Weighted CSPs (WCSPs) (also known as equivalence-preserving transformations of WCSPs) is well known and is used in many algorithms to approximate or bound the optimal WCSP value. In contrast, the concept of super-reparametrizations (changes of the weights that keep or increase the WCSP objective for every assignment) has been proposed before but never studied in detail. To fill this gap, we present a number of theoretical properties of super-reparametrizations and compare them to those of reparametrizations. Furthermore, we propose a framework for computing upper bounds on the optimal value of the (maximization version of) WCSP using super-reparametrizations. We show that it is in principle possible to employ arbitrary (under some technical conditions) constraint propagation rules to improve the bound. For arc consistency in particular, the method reduces to the known Virtual AC (VAC) algorithm. As a new contribution, we implemented the method for singleton arc consistency (SAC) and compared it with other strong local consistencies in WCSPs on a public benchmark. The results show that the bounds obtained from SAC are superior for many instance groups.
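To make the distinction above concrete, here is a small Python sketch on a hypothetical two-variable WCSP (maximization version). It checks that shifting weight between a binary and a unary table leaves every assignment's objective unchanged (a reparametrization), that raising a weight keeps or increases it (a super-reparametrization), and that the sum of per-table maxima is an upper bound on the optimum. The weights and helper names are assumptions for illustration only.

```python
from itertools import product

def objective(unary_x, unary_y, binary, a, b):
    """WCSP objective of the assignment x=a, y=b."""
    return unary_x[a] + unary_y[b] + binary[a][b]

def shift(unary_x, binary, a, delta):
    """Move delta from row a of the binary table into unary_x[a]."""
    new_unary = list(unary_x)
    new_unary[a] += delta
    new_binary = [list(row) for row in binary]
    new_binary[a] = [w - delta for w in new_binary[a]]
    return new_unary, new_binary

def trivial_bound(unary_x, unary_y, binary):
    """Sum of per-table maxima: an upper bound for the maximization WCSP."""
    return max(unary_x) + max(unary_y) + max(max(row) for row in binary)

unary_x, unary_y = [1, 0], [0, 2]
binary = [[3, 1], [0, 4]]
assignments = list(product(range(2), repeat=2))

# Reparametrization: the objective is unchanged for every assignment.
rep_x, rep_binary = shift(unary_x, binary, 0, 1)
preserved = all(
    objective(unary_x, unary_y, binary, a, b) == objective(rep_x, unary_y, rep_binary, a, b)
    for a, b in assignments
)

# Super-reparametrization: the objective never decreases.
sup_binary = [list(row) for row in binary]
sup_binary[1][0] += 2
dominates = all(
    objective(unary_x, unary_y, sup_binary, a, b) >= objective(unary_x, unary_y, binary, a, b)
    for a, b in assignments
)

optimum = max(objective(unary_x, unary_y, binary, a, b) for a, b in assignments)
print(preserved, dominates, optimum, trivial_bound(unary_x, unary_y, binary))
```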
Machine learning models are widely used for real-world applications, such as document analysis and vision. Constrained machine learning problems are problems where learned models must both be accurate and respect constraints. For continuous convex constraints, many approaches have been proposed, but learning under combinatorial constraints is still a hard problem. The goal of this paper is to broaden the modeling capacity of constrained machine learning problems by incorporating existing work from combinatorial optimization. We first propose a general framework called BaGeL (Branch, Generate and Learn), which applies Branch and Bound to constrained learning problems where a learning problem is generated and trained at each node until only valid models are obtained. Because machine learning has specific requirements, we also propose an extended table constraint to split the space of hypotheses.
Automated hyperparameter optimization (HPO) has gained great popularity and is an important ingredient of most automated machine learning frameworks. Designing HPO algorithms, however, is still an unsystematic and manual process: limitations of prior work are identified, and the proposed improvements, even though guided by expert knowledge, are still somewhat arbitrary. This rarely allows for gaining a holistic understanding of which algorithmic components are driving performance, and carries the risk of overlooking good algorithmic design choices. We present a principled approach to automated benchmark-driven algorithm design applied to multifidelity HPO (MF-HPO): First, we formalize a rich space of MF-HPO candidates that includes, but is not limited to, common HPO algorithms, and then present a configurable framework covering this space. To find the best candidate automatically and systematically, we follow a programming-by-optimization approach and search over the space of algorithm candidates via Bayesian optimization. We challenge whether the found design choices are necessary or could be replaced by simpler, more naive ones by performing an ablation analysis. We observe that a relatively simple configuration, in some ways simpler than established methods, performs very well as long as some critical configuration parameters have the right value.
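For readers unfamiliar with multifidelity HPO, the sketch below shows successive halving, one common MF-HPO building block: all candidates are evaluated at a low budget, the better fraction is kept, and the budget is increased. It is not the configuration found in the paper; the toy loss function, the candidate grid, and the parameter names are assumptions for illustration.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the candidates at each rung, multiplying the budget by eta."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]

def toy_loss(learning_rate, budget):
    """Hypothetical validation loss: best at 0.3, and less biased with more budget."""
    return (learning_rate - 0.3) ** 2 + 1.0 / budget

candidates = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
best = successive_halving(candidates, toy_loss)
print(best)
```

In this toy setting the low-fidelity ranking agrees with the high-fidelity one, which is the assumption successive halving relies on; real benchmarks may violate it.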
In the trial-and-error mechanism of reinforcement learning (RL), a notorious contradiction arises when we expect to learn a safe policy: how can a safe policy be learned without enough data or a prior model of the dangerous region? Existing methods mostly use a posterior penalty for dangerous actions, which means that the agent is not penalized until it experiences danger. As a result, the agent cannot learn a zero-violation policy even after convergence, because a policy with no violations would receive no penalty and lose its knowledge about danger. In this paper, we propose the safe set actor-critic (SSAC) algorithm, which confines the policy update using safety-oriented energy functions, also called safety indexes. The safety index is designed to increase rapidly for potentially dangerous actions, which allows us to locate the safe set on the action space, or the control safe set. Therefore, we can identify dangerous actions prior to taking them and obtain a zero constraint-violation policy after convergence. We claim that the energy function can be learned in a model-free manner, similar to learning a value function. By using the energy function transition as the constraint objective, we formulate a constrained RL problem. We prove that our Lagrangian-based solutions ensure that the learned policy converges to the constrained optimum under some assumptions. The proposed algorithm is evaluated on both complex simulation environments and a hardware-in-loop (HIL) experiment with a real controller from an autonomous vehicle. Experimental results suggest that the converged policy in all environments achieves zero constraint violation and performance comparable to the model-based baseline. Reinforcement learning has drawn rapidly growing attention for its superhuman learning capabilities in many sequential decision-making problems such as Go, Atari games, and StarCraft.
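The Lagrangian-based constrained formulation mentioned in the abstract above can be illustrated on a one-dimensional toy problem. The sketch below is not SSAC: it only shows the generic primal-dual pattern (gradient descent on the decision variable, projected gradient ascent on the multiplier) for the assumed problem of minimizing (x - 2)^2 subject to x <= 1, whose constrained optimum is x = 1 with multiplier 2.

```python
def primal_dual(steps=2000, lr=0.05):
    """Primal-dual gradient method on L(x, lam) = (x - 2)^2 + lam * (x - 1)."""
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_x = 2.0 * (x - 2.0) + lam   # dL/dx
        violation = x - 1.0              # dL/dlam, positive when x <= 1 is violated
        x = x - lr * grad_x              # descend in the primal variable
        lam = max(0.0, lam + lr * violation)  # ascend in the multiplier, kept non-negative
    return x, lam

x_opt, multiplier = primal_dual()
print(round(x_opt, 4), round(multiplier, 4))
```

The multiplier grows only while the constraint is violated, which is the mechanism that pushes the solution back to the boundary of the feasible set.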
By concisely representing a joint function of many variables as the combination of small functions, discrete graphical models (GMs) provide a powerful framework to analyze stochastic and deterministic systems of interacting variables. One of the main queries on such models is to identify the extremum of this joint function. This is known as the Weighted Constraint Satisfaction Problem (WCSP) on deterministic Cost Function Networks and as Maximum a Posteriori (MAP) inference on stochastic Markov Random Fields. Algorithms for approximate WCSP inference typically rely on local consistency algorithms or belief propagation. These methods are intimately related to linear programming (LP) relaxations and often coupled with reparametrizations defined by the dual solution of the associated LP. Since the seminal work of Goemans and Williamson, it is well understood that convex SDP relaxations can provide superior guarantees to LP. But the inherent computational cost of interior point methods has limited their application. The situation has improved with the introduction of non-convex Burer-Monteiro style methods which are well suited to handle the SDP relaxation of combinatorial problems with binary variables (such as MAXCUT, MaxSAT or MAP/Ising). We compute low rank SDP upper and lower bounds for discrete pairwise graphical models with arbitrary number of values and arbitrary binary cost functions by extending a Burer-Monteiro style method based on row-by-row updates. We consider a traditional dualized constraint approach and a dedicated Block Coordinate Descent approach which avoids introducing large penalty coefficients to the formulation. On increasingly hard and dense WCSP/CFN instances, we observe that the BCD approach can outperform the dualized approach and provide tighter bounds than local consistencies/convergent message passing approaches.
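As a rough illustration of a Burer-Monteiro style method with row-by-row updates, the sketch below treats only the binary MAXCUT case, not the multi-valued extension or the BCD variant described above. Each vertex gets a unit vector of low rank, each update replaces one vector by the normalized negative weighted sum of the others, and a random hyperplane rounds the vectors to a cut. The graph, rank, sweep count, and function names are assumptions.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mixing_maxcut(weights, rank=3, sweeps=50, seed=0):
    """Low-rank MAXCUT relaxation solved by row-by-row (coordinate) updates."""
    rng = random.Random(seed)
    n = len(weights)
    vectors = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(rank)]
        norm = math.sqrt(dot(v, v))
        vectors.append([x / norm for x in v])
    for _ in range(sweeps):
        for i in range(n):
            # Weighted sum of all other vectors; pointing away from it lowers sum w_ij v_i.v_j.
            g = [sum(weights[i][j] * vectors[j][k] for j in range(n) if j != i)
                 for k in range(rank)]
            norm = math.sqrt(dot(g, g))
            if norm > 1e-12:
                vectors[i] = [-x / norm for x in g]
    return vectors

def relaxed_value(weights, vectors):
    """Relaxation objective: sum over edges of w_ij * (1 - v_i.v_j) / 2."""
    n = len(weights)
    return sum(weights[i][j] * (1.0 - dot(vectors[i], vectors[j])) / 2.0
               for i in range(n) for j in range(i + 1, n))

def round_cut(weights, vectors, seed=0):
    """Random-hyperplane rounding; returns the sides and the weight of the cut."""
    rng = random.Random(seed)
    r = [rng.gauss(0.0, 1.0) for _ in vectors[0]]
    side = [1 if dot(v, r) >= 0 else -1 for v in vectors]
    n = len(weights)
    value = sum(weights[i][j] for i in range(n) for j in range(i + 1, n)
                if side[i] != side[j])
    return side, value

# Hypothetical instance: an unweighted 4-cycle.
weights = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
vectors = mixing_maxcut(weights)
relaxed = relaxed_value(weights, vectors)
side, cut = round_cut(weights, vectors)
print(relaxed, cut)
```

The rounded cut is always a feasible solution and hence a lower bound on the maximum cut. The relaxed value is only a certified upper bound when the low-rank iterate is a global optimum of the SDP, which this sketch does not verify.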
In most optimization problems, users have a clear understanding of the function to optimize (e.g., minimize the makespan for scheduling problems). However, the constraints may be difficult to state and their modelling often requires expertise in Constraint Programming. Active constraint acquisition has been successfully used to support non-experienced users in learning constraint networks through the generation of a sequence of queries. In this paper, we propose Learn&Optimize, a method to solve optimization problems with known objective function and unknown constraint network. It uses an active constraint acquisition algorithm which learns the unknown constraints and computes boundaries for the optimal solution during the learning process. As a result, our method allows users to solve optimization problems without learning the overall constraint network.
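The interplay of constraint acquisition and optimization can be sketched with a deliberately simplified learner. In the Python example below, a hypothetical user (the oracle) classifies complete assignments, and every candidate constraint violated by an accepted assignment is discarded; the objective is then minimized over the learned network. Unlike Learn&Optimize, this sketch asks about every assignment instead of generating queries selectively, and it computes no bounds during learning. The bias, the hidden target network, and all names are assumptions.

```python
from itertools import product
import operator

DOMAIN = range(3)
RELATIONS = {"<": operator.lt, ">": operator.gt, "!=": operator.ne,
             "==": operator.eq, "<=": operator.le, ">=": operator.ge}

# Bias: every candidate binary constraint the learner is allowed to consider.
bias = {f"x{i} {symbol} x{j}": (i, j, relation)
        for i, j in [(0, 1), (1, 2), (0, 2)]
        for symbol, relation in RELATIONS.items()}

def oracle(assignment):
    """Hypothetical user answering membership queries for the hidden target network."""
    x0, x1, x2 = assignment
    return x0 < x1 and x1 != x2

def acquire(bias, oracle):
    """Discard every candidate constraint violated by an assignment the user accepts."""
    learned = dict(bias)
    for assignment in product(DOMAIN, repeat=3):
        if oracle(assignment):
            learned = {name: c for name, c in learned.items()
                       if c[2](assignment[c[0]], assignment[c[1]])}
    return learned

def satisfies(assignment, constraints):
    return all(relation(assignment[i], assignment[j])
               for i, j, relation in constraints.values())

learned = acquire(bias, oracle)
# Known objective: minimize the sum of the variables under the learned constraints.
best = min((a for a in product(DOMAIN, repeat=3) if satisfies(a, learned)), key=sum)
print(sorted(learned), best)
```

The learned set contains the target constraints plus constraints implied by them, which is harmless for optimization because both networks have the same solutions.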
Prediction+optimization is a common real-world paradigm where we have to predict problem parameters before solving the optimization problem. However, the criteria by which the prediction model is trained are often inconsistent with the goal of the downstream optimization problem. Recently, decision-focused prediction approaches, such as SPO+ and direct optimization, have been proposed to fill this gap. However, they cannot directly handle the soft constraints with the $\max$ operator required in many real-world objectives. This paper proposes a novel analytically differentiable surrogate objective framework for real-world linear and semi-definite negative quadratic programming problems with soft linear and non-negative hard constraints. This framework gives theoretical bounds on the constraints' multipliers and derives the closed-form solution with respect to the predictive parameters, and thus gradients for any variable in the problem. We evaluate our method in three applications extended with soft constraints: synthetic linear programming, portfolio optimization, and resource provisioning, demonstrating that our method outperforms traditional two-stage methods and other decision-focused approaches.
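To show what a soft constraint with the max operator looks like in a predict-then-optimize setting, the sketch below evaluates a linear objective with hinge penalties and measures the decision regret caused by a mispredicted parameter vector. It does not implement the paper's differentiable surrogate; the candidate decisions, penalty weights, and parameter values are assumptions chosen for illustration.

```python
def soft_objective(theta, x, A, b, alpha):
    """Linear reward theta.x minus penalties alpha_k * max(0, A_k.x - b_k)."""
    reward = sum(t * xi for t, xi in zip(theta, x))
    penalty = 0.0
    for row, bound, weight in zip(A, b, alpha):
        excess = sum(r * xi for r, xi in zip(row, x)) - bound
        penalty += weight * max(0.0, excess)
    return reward - penalty

def best_decision(theta, candidates, A, b, alpha):
    """Pick the candidate decision that maximizes the soft-constrained objective."""
    return max(candidates, key=lambda x: soft_objective(theta, x, A, b, alpha))

def regret(theta_true, theta_pred, candidates, A, b, alpha):
    """Loss in true objective from optimizing against predicted parameters."""
    x_true = best_decision(theta_true, candidates, A, b, alpha)
    x_pred = best_decision(theta_pred, candidates, A, b, alpha)
    return (soft_objective(theta_true, x_true, A, b, alpha)
            - soft_objective(theta_true, x_pred, A, b, alpha))

# Hypothetical instance: one soft capacity constraint x0 + x1 <= 2 with penalty weight 3.
A, b, alpha = [[1, 1]], [2], [3]
candidates = [[1, 1], [1, 2], [2, 0]]
theta_true, theta_pred = [2, 1], [1, 2.5]
print(regret(theta_true, theta_pred, candidates, A, b, alpha))
```

A prediction can have small parameter error and still incur regret, which is the mismatch that decision-focused training tries to address.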