Learning Mixed-Integer Linear Programs from Contextual Examples
Kumar, Mohit, Kolb, Samuel, De Raedt, Luc, Teso, Stefano
Mixed-integer linear programs (MILPs) are widely used in artificial intelligence and operations research to model complex decision problems like scheduling and routing. Designing such programs, however, requires both domain and modelling expertise. In this paper, we study the problem of acquiring MILPs from contextual examples, a novel and realistic setting in which examples capture solutions and non-solutions within a specific context. The resulting learning problem involves acquiring continuous parameters -- namely, a cost vector and a feasibility polytope -- but has a distinctly combinatorial flavor. To solve this complex problem, we also contribute MISSLE, an algorithm for learning MILPs from contextual examples. MISSLE uses a variant of stochastic local search that is guided by the gradient of a continuous surrogate loss function. Our empirical evaluation on synthetic data shows that MISSLE acquires better MILPs faster than alternatives based on stochastic local search and gradient descent.
Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges
Bischl, Bernd, Binder, Martin, Lang, Michel, Pielok, Tobias, Richter, Jakob, Coors, Stefan, Thomas, Janek, Ullmann, Theresa, Becker, Marc, Boulesteix, Anne-Laure, Deng, Difan, Lindauer, Marius
Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
Efficient Set of Vectors Search
Leybovich, Michael, Shmueli, Oded
We consider a similarity measure between two sets $A$ and $B$ of vectors that balances the average and maximum cosine distance between pairs of vectors, one from set $A$ and one from set $B$. As a motivation for this measure, we present lineage tracking in a database. To practically realize this measure, we need an approximate search algorithm that, given a set of vectors $A$ and sets of vectors $B_1,...,B_n$, quickly locates the set $B_i$ that maximizes the similarity measure. For the case where all sets are singleton sets, that is, each is essentially a single vector, there are known efficient approximate search algorithms, e.g., approximated versions of tree search algorithms, locality-sensitive hashing (LSH), vector quantization (VQ) and proximity graph algorithms. In this work, we present approximate search algorithms for the general case. The underlying idea in these algorithms is encoding a set of vectors via a "long" single vector.
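The abstract above does not give the exact formula for the measure, so the following is only a brute-force sketch under stated assumptions: the "balance" is taken to be a convex combination, with a hypothetical weight `alpha`, of the mean and the maximum pairwise cosine similarity, and the search scans every candidate set exactly rather than approximately. The names `set_similarity` and `best_matching_set` are illustrative and do not come from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def set_similarity(A, B, alpha=0.5):
    """ASSUMED form of the measure: alpha * mean + (1 - alpha) * max
    over all pairwise cosine similarities between A and B."""
    sims = [cosine_similarity(a, b) for a in A for b in B]
    return alpha * (sum(sims) / len(sims)) + (1 - alpha) * max(sims)


def best_matching_set(A, candidates, alpha=0.5):
    """Exact linear scan: return (index, score) of the best candidate set."""
    scores = [set_similarity(A, B, alpha) for B in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores[best]


A = [(1.0, 0.0), (0.0, 1.0)]
candidates = [
    [(-1.0, 0.0)],
    [(1.0, 0.0), (0.0, 1.0)],
    [(1.0, 1.0)],
]
print(best_matching_set(A, candidates))
```

The scan costs one cosine evaluation per vector pair per candidate set, which is the cost an approximate method such as the one described in the abstract aims to avoid.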
Implementing Custom GridSearchCV and RandomSearchCV without scikit-learn
Grid Search can be thought of as an exhaustive search for selecting a model. In Grid Search, the data scientist sets up a grid of hyperparameter values and, for each combination, trains a model and scores it on the testing data. Because every combination of hyperparameter values is tried, this approach can be very inefficient. For example, searching 20 different parameter values for each of 4 parameters requires 20^4 = 160,000 trials of cross-validation. This equates to 1,600,000 model fits and 1,600,000 predictions if 10-fold cross-validation is used.
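A minimal sketch of the idea described above, written without scikit-learn and using only the standard library. The function names (`grid_search_cv`, `k_fold_indices`) and the toy threshold "model" are hypothetical and chosen for illustration; they are not taken from the article. The sketch also returns the number of model fits so the combinatorial cost is visible.

```python
import itertools
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle range(n_samples) and split it into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def grid_search_cv(fit, score, X, y, param_grid, k=10):
    """Try every combination in param_grid and score each with k-fold CV.

    fit(X_train, y_train, **params) -> model
    score(model, X_val, y_val) -> float, higher is better
    Returns (best_params, best_mean_score, number_of_fits).
    """
    folds = k_fold_indices(len(X), k)
    names = list(param_grid)
    best_params, best_score, n_fits = None, float("-inf"), 0
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            fold_scores.append(
                score(model, [X[i] for i in held_out], [y[i] for i in held_out])
            )
            n_fits += 1
        mean_score = sum(fold_scores) / k
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score, n_fits


# Toy stand-ins (assumptions): the "model" just stores its hyperparameters.
def fit_threshold(X_train, y_train, threshold, flip):
    return (threshold, flip)


def accuracy(model, X_val, y_val):
    threshold, flip = model
    preds = [int((x > threshold) != flip) for x in X_val]
    return sum(p == t for p, t in zip(preds, y_val)) / len(y_val)


X = [i / 20 for i in range(20)]
y = [int(x > 0.5) for x in X]
grid = {"threshold": [0.0, 0.25, 0.5, 0.75], "flip": [False, True]}
print(grid_search_cv(fit_threshold, accuracy, X, y, grid, k=5))

# Cost of the example in the text: 20 values for each of 4 parameters.
print(20 ** 4, 20 ** 4 * 10)
```

The toy grid has 4 x 2 = 8 combinations, so 5-fold cross-validation performs 40 fits; the same counting gives the 160,000 trials and 1,600,000 fits quoted above.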
Designing Machine Learning Pipeline Toolkit for AutoML Surrogate Modeling Optimization
Palmes, Paulito P., Kishimoto, Akihiro, Marinescu, Radu, Ram, Parikshit, Daly, Elizabeth
The pipeline optimization problem in machine learning requires simultaneous optimization of pipeline structures and parameter adaptation of their elements. Having an elegant way to express these structures can help lessen the complexity in the management and analysis of their performances together with the different choices of optimization strategies. With these issues in mind, we created the AutoMLPipeline (AMLP) toolkit which facilitates the creation and evaluation of complex machine learning pipeline structures using simple expressions. We use AMLP to find optimal pipeline signatures, datamine them, and use these datamined features to speed up learning and prediction. We formulated a two-stage pipeline optimization with surrogate modeling in AMLP which outperforms other AutoML approaches with a 4-hour time budget in less than 5 minutes of AMLP computation time.
HANT: Hardware-Aware Network Transformation
Molchanov, Pavlo, Hall, Jimmy, Yin, Hongxu, Kautz, Jan, Fusi, Nicolo, Vahdat, Arash
Given a trained network, how can we accelerate it to meet efficiency needs for deployment on particular hardware? The commonly used hardware-aware network compression techniques address this question with pruning, kernel fusion, quantization and lowering precision. However, these approaches do not change the underlying network operations. In this paper, we propose hardware-aware network transformation (HANT), which accelerates a network by replacing inefficient operations with more efficient alternatives using a neural-architecture-search-like approach. HANT tackles the problem in two phases: In the first phase, a large number of alternative operations for every layer of the teacher model is trained using layer-wise feature map distillation. In the second phase, the combinatorial selection of efficient operations is relaxed to an integer optimization problem that can be solved in a few seconds. We extend HANT with kernel fusion and quantization to improve throughput even further. Our experimental results on accelerating the EfficientNet family show that HANT can accelerate them by up to 3.6x with <0.4% drop in the top-1 accuracy on the ImageNet dataset. When comparing the same latency level, HANT can accelerate EfficientNet-B4 to the same latency as EfficientNet-B1 while having 3% higher accuracy. We examine a large pool of operations, up to 197 per layer, and we provide insights into the selected operations and final architectures.
Reinforced Hybrid Genetic Algorithm for the Traveling Salesman Problem
Zheng, Jiongzhi, Chen, Menglei, Zhong, Jialun, He, Kun
Given a set of cities with certain locations, the Traveling Salesman Problem (TSP) is to find the shortest Hamiltonian route, along which a salesman travels from a city to visit all the cities exactly once and finally returns to the starting city. The TSP is one of the most famous and well-studied NP-hard combinatorial optimization problems, which is very easy to understand but very difficult to solve optimally or near-optimally. Over the years, the TSP has become a touchstone for algorithm design. Typical methods for solving the TSP are mainly exact algorithms, approximation algorithms and heuristics. The exact algorithms may be prohibitive for large instances and the approximation algorithms may suffer from weak optimal guarantees or empirical performance (Khalil et al. 2017). Heuristics are known to be the most efficient and effective approaches for solving the TSP.
Learning to Delegate for Large-scale Vehicle Routing
Li, Sirui, Yan, Zhongxia, Wu, Cathy
Vehicle routing problems (VRPs) are a class of combinatorial problems with wide practical applications. While previous heuristic or learning-based works achieve decent solutions on small problem instances of up to 100 customers, their performance does not scale to large problems. This article presents a novel learning-augmented local search algorithm to solve large-scale VRP. The method iteratively improves the solution by identifying appropriate subproblems and $\textit{delegating}$ their improvement to a black box subsolver. At each step, we leverage spatial locality to consider only a linear number of subproblems, rather than exponential. We frame subproblem selection as a regression problem and train a Transformer on a generated training set of problem instances. We show that our method achieves state-of-the-art performance, with a speed-up of up to 15 times over strong baselines, on VRPs with sizes ranging from 500 to 3000.
YouTube Algorithm Directs Viewers to False, Sexualized Videos, Study Finds
YouTube has instituted many changes over the past year to limit the problematic videos it recommends to viewers. A new study suggests the repairs have a way to go. Software nonprofit Mozilla Foundation found that YouTube's powerful recommendation engine continues to direct viewers to videos that they say showed false claims and sexualized content, with the platform's algorithms suggesting 71% of the videos that participants found objectionable. The study highlights the continuing challenge Alphabet Inc. subsidiary YouTube faces as it tries to police the user-generated content that turned it into the world's leading video service. It is emblematic of the struggle roiling platforms from Facebook Inc. to Twitter Inc., which soared to prominence by encouraging people to share information but which now face regulatory and social pressure to police divisive, misleading and dangerous content without censoring diverse points of view. For YouTube, it also shows gaps in its efforts to steer users to videos that should be of interest based on viewership patterns, as opposed to those that are going viral for other reasons.
How AI Revolutionised the Ancient Game of Chess
I have come to the personal conclusion that while all artists are not chess players, all chess players are artists. Originally called Chaturanga, the game was set on an 8x8 Ashtāpada board and shared two key fundamental features that still distinguish the game today: different pieces subject to different rules of movement, and the presence of a single king piece whose fate determines the outcome. But it was not until the 15th century, with the introduction of the queen piece and the popularization of various other rules, that we saw the game develop into the form we know today. The emergence of international chess competition in the late 19th century meant that the game took on a new geopolitical importance.