Constraint-Based Reasoning
On the Complexity of Breaking Symmetry
We can break symmetry by eliminating solutions within a symmetry class that are not least in the lexicographical ordering. This is often referred to as the lex-leader method. Unfortunately, as symmetry groups can be large, the lexleader method is not tractable in general. We prove that using other total orderings besides the usual lexicographical ordering will not reduce the computational complexity of breaking symmetry in general. It follows that breaking symmetry with other orderings like the Gray code ordering or the Snake-Lex ordering is intractable in general.
Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures
Marco, Alonso, von Rohr, Alexander, Baumann, Dominik, Hernández-Lobato, José Miguel, Trimpe, Sebastian
When learning to ride a bike, a child falls down a number of times before achieving the first success. As falling down usually has only mild consequences, it can be seen as a tolerable failure in exchange for a faster learning process, as it provides rich information about an undesired behavior. In the context of Bayesian optimization under unknown constraints (BOC), typical strategies for safe learning explore conservatively and avoid failures by all means. On the other side of the spectrum, non conservative BOC algorithms that allow failing may fail an unbounded number of times before reaching the optimum. In this work, we propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures. Empirical validation shows that our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods. In addition, we propose an original algorithm for unconstrained Bayesian optimization inspired by the notion of excursion sets in stochastic processes, upon which the failures-aware algorithm is built.
Thompson Sampling for Linearly Constrained Bandits
Saxena, Vidit, Gonzalez, Joseph E., Jaldén, Joakim
We address multi-armed bandits (MAB) where the objective is to maximize the cumulative reward under a probabilistic linear constraint. For a few real-world instances of this problem, constrained extensions of the well-known Thompson Sampling (TS) heuristic have recently been proposed. However, finite-time analysis of constrained TS is challenging; as a result, only O(\sqrt{T}) bounds on the cumulative reward loss (i.e., the regret) are available. In this paper, we describe LinConTS, a TS-based algorithm for bandits that place a linear constraint on the probability of earning a reward in every round. We show that for LinConTS, the regret as well as the cumulative constraint violations are upper bounded by O(\log T) for the suboptimal arms. We develop a proof technique that relies on careful analysis of the dual problem and combine it with recent theoretical work on unconstrained TS. Through numerical experiments on two real-world datasets, we demonstrate that LinConTS outperforms an asymptotically optimal upper confidence bound (UCB) scheme in terms of simultaneously minimizing the regret and the violation.
Encoding Linear Constraints into SAT
Abío, Ignasi, Mayer-Eichberger, Valentin, Stuckey, Peter
Linear integer constraints are one of the most important constraints in combinatorial problems since they are commonly found in many practical applications. Typically, encodings to Boolean satisfiability (SAT) format of conjunctive normal form perform poorly in problems with these constraints in comparison with SAT modulo theories (SMT), lazy clause generation (LCG) or mixed integer programming (MIP) solvers. In this paper we explore and categorize SAT encodings for linear integer constraints. We define new SAT encodings based on multi-valued decision diagrams, and sorting networks. We compare different SAT encodings of linear constraints and demonstrate where one may be preferable to another. We also compare SAT encodings against other solving methods and show they can be better than linear integer (MIP) solvers and sometimes better than LCG or SMT solvers on appropriate problems. Combining the new encoding with lazy decomposition, which during runtime only encodes constraints that are important to the solving process that occurs, gives the best option for many highly combinatorial problems involving linear constraints.
Using Self-Organizing Maps to solve the Traveling Salesman Problem
The Traveling Salesman Problem is a well known challenge in Computer Science: it consists on finding the shortest route possible that traverses all cities in a given map only once. Although its simple explanation, this problem is, indeed, NP-Complete. This implies that the difficulty to solve it increases rapidly with the number of cities, and we do not know in fact a general solution that solves the problem. For that reason, we currently consider that any method able to find a sub-optimal solution is generally good enough (we cannot verify if the solution returned is the optimal one most of the times). To solve it, we can try to apply a modification of the Self-Organizing Map (SOM) technique.
Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation
Yang, Mengyue, Li, Qingyang, Qin, Zhiwei, Ye, Jieping
Contextual multi-armed bandit (MAB) achieves cutting-edge performance on a variety of problems. When it comes to real-world scenarios such as recommendation system and online advertising, however, it is essential to consider the resource consumption of exploration. In practice, there is typically non-zero cost associated with executing a recommendation (arm) in the environment, and hence, the policy should be learned with a fixed exploration cost constraint. It is challenging to learn a global optimal policy directly, since it is a NP-hard problem and significantly complicates the exploration and exploitation trade-off of bandit algorithms. Existing approaches focus on solving the problems by adopting the greedy policy which estimates the expected rewards and costs and uses a greedy selection based on each arm's expected reward/cost ratio using historical observation until the exploration resource is exhausted. However, existing methods are hard to extend to infinite time horizon, since the learning process will be terminated when there is no more resource. In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint. HATCH adopts an adaptive method to allocate the exploration resource based on the remaining resource/time and the estimation of reward distribution among different user contexts. In addition, we utilize full of contextual feature information to find the best personalized recommendation. Finally, in order to prove the theoretical guarantee, we present a regret bound analysis and prove that HATCH achieves a regret bound as low as $O(\sqrt{T})$. The experimental results demonstrate the effectiveness and efficiency of the proposed method on both synthetic data sets and the real-world applications.
Where you should drop Deep Learning in favor of Constraint Solvers
Machine Learning and Deep Learning are ongoing buzzwords in the industry. Branding ahead of functionalities led to Deep Learning being overused in many artificial intelligence applications. This post will provide a quick grasp at constraint satisfaction, a powerful yet underused approach which can tackle a large number of problems in AI and other areas of computer science, from logistics and scheduling to temporal reasoning and graph problems. Let's consider a factual and highly topical problem. Hospitals must organize quickly to treat ill people.
Hybrid Classification and Reasoning for Image-based Constraint Solving
Mulamba, Maxime, Mandi, Jayanta, Canoy, Rocsildes, Guns, Tias
There is an increased interest in solving complex constrained problems where part of the input is not given as facts but received as raw sensor data such as images or speech. We will use "visual sudoku" as a prototype problem, where the given cell digits are handwritten and provided as an image thereof. In this case, one first has to train and use a classifier to label the images, so that the labels can be used for solving the problem. In this paper, we explore the hybridization of classifying the images with the reasoning of a constraint solver. We show that pure constraint reasoning on predictions does not give satisfactory results. Instead, we explore the possibilities of a tighter integration, by exposing the probabilistic estimates of the classifier to the constraint solver. This allows joint inference on these probabilistic estimates, where we use the solver to find the maximum likelihood solution. We explore the trade-off between the power of the classifier and the power of the constraint reasoning, as well as further integration through the additional use of structural knowledge. Furthermore, we investigate the effect of calibration of the probabilistic estimates on the reasoning. Our results show that such hybrid approaches vastly outperform a separate approach, which encourages a further integration of prediction (probabilities) and constraint solving.
Generalized Nested Rollout Policy Adaptation
Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single player games. In this paper we propose to generalize NRPA with a temperature and a bias and to analyze theoretically the algorithms. The generalized algorithm is named GNRPA. Experiments show it improves on NRPA for different application domains: SameGame and the Traveling Salesman Problem with Time Windows.
Intel debuts Pohoiki Springs, a powerful neuromorphic research system for AI workloads
This morning, Intel announced the general readiness of Pohoiki Springs, a powerful self-contained neuromorphic system that's about the size of five standard servers. The company says the system will be available to members of the Intel Neuromorphic Research Community via the cloud using Intel's Nx SDK and community-contributed software components, giving them a tool to scale up their neuromorphic research and explore ways to accelerate workloads that run slowly on today's conventional architectures. Intel claims Pohoiki Springs, which was announced in July 2019, is similar in neural capacity to the brain of a small mammal, with 768 Loihi chips and 100 million neurons spread across 24 Arria10 FPGA Nahuku expansion boards (containing 32 chips each) that operate at under 500 watts. This is ostensibly a step on the path to supporting larger and more sophisticated neuromorphic workloads. In fact, just this week, Intel demonstrated that the chips can be used to "teach" an AI model to distinguish among 10 different scents. "Pohoiki Springs enables our research partners to explore ways to accelerate workloads that run slowly today on conventional architectures, including high-performance computing systems," said Intel neuromorphic compute lab director Mike Davies in a statement.