Malitsky, Yuri
Deep Learning for Algorithm Portfolios
Loreggia, Andrea (University of Padova and IBM Research) | Malitsky, Yuri (IBM Research) | Samulowitz, Horst (IBM Research) | Saraswat, Vijay (IBM Research)
It is well established that in many scenarios there is no single solver that will provide optimal performance across a wide range of problem instances. Taking advantage of this observation, research into algorithm selection is designed to help identify the best approach for each problem at hand. This selection is usually based on carefully constructed features, designed to quickly summarize the overall structure of the instance as a constant-size numeric vector. Based on these features, a plethora of machine learning techniques can be utilized to predict the appropriate solver to execute, leading to significant improvements over relying solely on any one solver. However, because such features are constructed by hand, creating a good set of them is an arduous task requiring a great deal of knowledge of the problem domain of interest. To alleviate this costly yet crucial step, this paper presents an automated methodology for producing an informative set of features utilizing a deep neural network. We show that the presented approach completely automates the algorithm selection pipeline and is able to achieve significantly better performance than the single best solver across multiple problem domains.
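As a rough illustration of the end-to-end idea (and not the network architecture used in the paper), the sketch below trains a small neural network to map a raw, fixed-size numeric encoding of an instance directly to the solver predicted to perform best; the data, solver names, and dimensions are placeholders.

```python
# Minimal sketch of learned, end-to-end algorithm selection: the hidden layers
# play the role of automatically learned features, replacing hand-crafted ones.
# All data below is synthetic; solver names and shapes are placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
solvers = ["solver_A", "solver_B", "solver_C"]

# X: one row per training instance (raw encoding); y: index of its best solver.
X_train = rng.random((500, 128))
y_train = rng.integers(0, len(solvers), size=500)

selector = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=0)
selector.fit(X_train, y_train)

new_instance = rng.random((1, 128))
print("run:", solvers[selector.predict(new_instance)[0]])
```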
ASlib: A Benchmark Library for Algorithm Selection
Bischl, Bernd, Kerschke, Pascal, Kotthoff, Lars, Lindauer, Marius, Malitsky, Yuri, Frechette, Alexandre, Hoos, Holger, Hutter, Frank, Leyton-Brown, Kevin, Tierney, Kevin, Vanschoren, Joaquin
The task of algorithm selection involves choosing an algorithm from a set of algorithms on a per-instance basis in order to exploit the varying performance of algorithms over a set of instances. The algorithm selection problem is attracting increasing attention from researchers and practitioners in AI. Years of fruitful applications in a number of domains have resulted in a large amount of data, but the community lacks a standard format or repository for this data. This situation makes it difficult to share and compare different approaches effectively, as is done in other, more established fields. It also unnecessarily hinders new researchers who want to work in this area. To address this problem, we introduce a standardized format for representing algorithm selection scenarios and a repository that contains a growing number of data sets from the literature. Our format has been designed to be able to express a wide variety of different scenarios. Demonstrating the breadth and power of our platform, we describe a set of example experiments that build and evaluate algorithm selection models through a common interface. The results display the potential of algorithm selection to achieve significant performance improvements across a broad range of problems and algorithms.
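To make the baselines that such scenarios support concrete, the following sketch computes the single best solver and the virtual best solver from a per-instance runtime matrix; the matrix here is invented, whereas a real ASlib scenario provides these values in its standardized files.

```python
# Two standard baselines in algorithm selection: the single best solver (best
# average performance overall) and the virtual best solver (per-instance oracle).
import numpy as np

# rows = instances, columns = algorithms, entries = runtime in seconds (made up)
runtimes = np.array([
    [10.0,  3.0, 50.0],
    [ 2.0, 40.0,  5.0],
    [30.0,  6.0,  1.0],
    [ 4.0,  8.0, 60.0],
])

sbs_index = runtimes.mean(axis=0).argmin()      # one algorithm for every instance
sbs_cost = runtimes[:, sbs_index].mean()
vbs_cost = runtimes.min(axis=1).mean()          # oracle picks per instance

print(f"single best solver: algorithm {sbs_index}, mean runtime {sbs_cost:.1f}s")
print(f"virtual best solver: mean runtime {vbs_cost:.1f}s")
```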
Towards Cognitive Automation of Data Science
Biem, Alain (IBM Research) | Butrico, Maria (IBM Research) | Feblowitz, Mark (IBM Research) | Klinger, Tim (IBM Research) | Malitsky, Yuri (IBM Research) | Ng, Kenney (IBM Research) | Perer, Adam (IBM Research) | Reddy, Chandra (IBM Research) | Riabov, Anton (IBM Research) | Samulowitz, Horst (IBM Research) | Sow, Daby (IBM Research) | Tesauro, Gerald (IBM Research) | Turaga, Deepak (IBM Research)
A Data Scientist typically performs a number of tedious and time-consuming steps to derive insight from a raw data set. The process usually starts with data ingestion, cleaning, and transformation (e.g., outlier removal, missing-value imputation), then proceeds to model building, and finally to a presentation of predictions that align with the end-user's objectives and preferences. It is a long, complex, and sometimes artful process requiring substantial time and effort, especially because of the combinatorial explosion in choices of algorithms (and platforms), their parameters, and their compositions. Tools that can help automate steps in this process have the potential to accelerate the time-to-delivery of useful results, expand the reach of data science to non-experts, and offer a more systematic exploration of the available options. This work presents a step towards this goal.
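A toy sketch of the combinatorial choice space described above, using an off-the-shelf pipeline and grid search rather than the system from the paper; the dataset, steps, and parameter grid are placeholders.

```python
# Systematic exploration of preprocessing and model choices with their parameters.
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

pipeline = Pipeline([
    ("impute", SimpleImputer()),               # cleaning / missing-value handling
    ("model", LogisticRegression(max_iter=1000)),
])

search = GridSearchCV(
    pipeline,
    param_grid={
        "impute__strategy": ["mean", "median"],
        "model__C": [0.1, 1.0, 10.0],
    },
    cv=3,
)
search.fit(X, y)
print(search.best_params_)
```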
MaxSAT by Improved Instance-Specific Algorithm Configuration
Ansotegui, Carlos (University of Lleida) | Malitsky, Yuri (Insight Centre for Data Analytics) | Sellmann, Meinolf (IBM Watson Research Center)
MaxSAT is the optimization version of the Satisfiability (SAT) problem. It can be used effectively to model problems in several domains, such as scheduling, timetabling, FPGA routing, design and circuit debugging, software package installation, bioinformatics, probabilistic reasoning, etc. We show how both techniques can be combined and empirically demonstrate on SAT that our improved method works notably better than the original method and other instance-specific algorithm tuners. We then apply the new technique to MaxSAT. Finally, in extensive experiments we show that the developed solvers significantly outperform the current state-of-the-art in every MaxSAT domain.
Proteus: A Hierarchical Portfolio of Solvers and Transformations
Hurley, Barry, Kotthoff, Lars, Malitsky, Yuri, O'Sullivan, Barry
In recent years, portfolio approaches to solving SAT problems and CSPs have become increasingly common. There are also a number of different encodings for representing CSPs as SAT instances. In this paper, we leverage advances in both SAT and CSP solving to present a novel hierarchical portfolio-based approach to CSP solving, which we call Proteus, that does not rely purely on CSP solvers. Instead, it may decide that it is best to encode a CSP instance into SAT, selecting an appropriate encoding and a corresponding SAT solver. Our experimental evaluation used an instance of Proteus that involved four CSP solvers, three SAT encodings, and six SAT solvers, evaluated on the most challenging problem instances from the CSP solver competitions, involving global and intensional constraints. We show that Proteus can achieve significant performance improvements by exploiting alternative viewpoints and solvers for combinatorial problem solving.
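The following is a hedged sketch of a two-level decision in the spirit of Proteus: a first model chooses whether to keep the instance as a CSP or encode it to SAT, and a second model chooses a concrete solver for that representation (the real system also chooses among several SAT encodings). Features, labels, and solver counts are placeholders, not Proteus's actual models.

```python
# Hierarchical portfolio sketch: top-level model picks the representation,
# lower-level models pick the solver for that representation. Synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.random((300, 20))                      # instance features (placeholder)
representation = rng.integers(0, 2, 300)       # 0 = solve as CSP, 1 = encode to SAT
solver_if_csp = rng.integers(0, 4, 300)        # four hypothetical CSP solvers
solver_if_sat = rng.integers(0, 6, 300)        # six hypothetical SAT solvers

top = RandomForestClassifier(random_state=0).fit(X, representation)
csp_level = RandomForestClassifier(random_state=0).fit(
    X[representation == 0], solver_if_csp[representation == 0])
sat_level = RandomForestClassifier(random_state=0).fit(
    X[representation == 1], solver_if_sat[representation == 1])

def select(instance_features):
    """Return (representation, solver index) for one instance."""
    x = instance_features.reshape(1, -1)
    if top.predict(x)[0] == 0:
        return ("csp", csp_level.predict(x)[0])
    return ("sat", sat_level.predict(x)[0])

print(select(rng.random(20)))
```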
Non-Model-Based Search Guidance for Set Partitioning Problems
Kadioglu, Serdar (Brown University) | Malitsky, Yuri (Brown University) | Sellmann, Meinolf (IBM Research)
Search is an integral part of solution approaches for NP-hard combinatorial optimization and decision problems. Once the ability to reason deterministically is exhausted, state-of-the-art solvers try out different alternatives which may lead to an improved (in case of optimization) or feasible (in case of satisfaction) solution. This consideration of alternatives may take place highly opportunistically as in local search approaches, or systematically as in backtracking-based methods. Instead, we cluster training instances according to their features and determine an assignment of branching heuristics to clusters that results in the best performance when the branching heuristic is dynamically chosen based on the current subproblem's nearest cluster. We test our approach on the MIP solver Cplex, which we use to tackle set partitioning problems. Our experiments show that this approach can effectively boost search performance.
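A schematic reconstruction of the offline/online split described above, assuming the per-cluster benchmarking has already been done; the features and heuristic names are invented, and the real system hooks into Cplex rather than a standalone function.

```python
# Offline: cluster training (sub)problems by features and record the best
# branching heuristic per cluster. Online: map the current subproblem to its
# nearest cluster and use that cluster's heuristic. Synthetic data throughout.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
features = rng.random((200, 8))                # features of training subproblems

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)

# Assume offline benchmarking already identified the best heuristic per cluster.
heuristic_for_cluster = {0: "pseudocost", 1: "most-fractional", 2: "strong-branching"}

def branching_heuristic(subproblem_features):
    """Pick the heuristic of the cluster nearest to the current subproblem."""
    cluster = int(kmeans.predict(subproblem_features.reshape(1, -1))[0])
    return heuristic_for_cluster[cluster]

print(branching_heuristic(rng.random(8)))
```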
Filtering Bounded Knapsack Constraints in Expected Sublinear Time
Malitsky, Yuri (Brown University) | Sellmann, Meinolf (Brown University) | Szymanek, Radoslaw (Ecole Polytechnique Federale de Lausanne)
We present a highly efficient incremental algorithm for propagating bounded knapsack constraints. Our algorithm is based on the sublinear filtering algorithm for binary knapsack constraints by Katriel et al. and achieves similar speed-ups of one to two orders of magnitude when compared with its linear-time counterpart. We also show that the representation of bounded knapsacks as binary knapsacks leads to ineffective filtering behavior. Experiments on standard knapsack benchmarks show that the new algorithm significantly outperforms existing methods for handling bounded knapsack constraints.
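To illustrate what filtering such a constraint means, the sketch below performs a naive bounds-tightening pass for L <= sum_i w[i]*x[i] <= C over integer variables; it is linear per call and deliberately not the incremental, expected-sublinear algorithm of the paper, only a picture of the basic idea.

```python
# Naive bounds filtering for a bounded knapsack constraint: tighten each
# variable's bounds against the extreme contributions of the other variables.
def filter_bounded_knapsack(w, lo, hi, L, C):
    def ceil_div(a, b):  # integer ceiling division for positive b
        return -(-a // b)

    total_min = sum(wi * li for wi, li in zip(w, lo))
    total_max = sum(wi * ui for wi, ui in zip(w, hi))
    new_lo, new_hi = list(lo), list(hi)
    for i, wi in enumerate(w):
        rest_min = total_min - wi * lo[i]   # other variables at their minimum
        rest_max = total_max - wi * hi[i]   # other variables at their maximum
        new_lo[i] = max(lo[i], ceil_div(L - rest_max, wi))
        new_hi[i] = min(hi[i], (C - rest_min) // wi)
        if new_lo[i] > new_hi[i]:
            return None                     # constraint is infeasible
    return new_lo, new_hi

# x0 in [0,3] with weight 5, x1 in [0,2] with weight 4; require 12 <= total <= 14
print(filter_bounded_knapsack([5, 4], [0, 0], [3, 2], 12, 14))  # ([1, 0], [2, 2])
```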