Optimization
A metaheuristic for crew scheduling in a pickup-and-delivery problem with time windows
Lucci, Mauro, Severín, Daniel, Zabala, Paula
A vehicle routing and crew scheduling problem (VRCSP) consists of simultaneously planning the routes of a fleet of vehicles and scheduling the crews, where the vehicle-crew correspondence is not fixed through time. This allows greater planning flexibility and more efficient use of the fleet but, in return, demands tight synchronisation. In this work, we present a VRCSP where pickup-and-delivery requests with time windows have to be fulfilled over a given planning horizon by using trucks and drivers. Crews can be composed of 1 or 2 drivers and any of them can be relieved at a given set of locations. Moreover, they are allowed to travel among locations with non-company shuttles, at an additional cost that is minimised. As our problem considers distinct routes for trucks and drivers, we gain additional flexibility not contemplated in previous VRCSPs in the literature, where a crew is handled as an indivisible unit. We tackle this problem with a two-stage sequential approach: a set of truck routes is computed in the first stage and a set of driver routes consistent with the truck routes is obtained in the second one. We design and evaluate the performance of a metaheuristic-based algorithm for the latter stage. Our algorithm is mainly a GRASP with a perturbation procedure that allows reusing solutions already found in case the search for new solutions becomes difficult. This procedure, together with another that repairs infeasible solutions, allows us to find high-quality solutions on instances of 100 requests spread across 15 cities with a fleet of 12-32 trucks (depending on the planning horizon) in less than an hour. We also conclude that the possibility of carrying an additional driver reduces the cost of external shuttles by about 60% on average with respect to individual crews and, in some cases, removes this cost completely.
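To make the "GRASP with a perturbation procedure" idea concrete, the following is a minimal, generic sketch on a toy tour problem, not the authors' driver-routing algorithm. The problem (a small symmetric tour), the restricted-candidate-list rule, the 2-opt local search, and the stagnation threshold `patience` are all assumptions chosen for illustration.

```python
import random

def tour_cost(tour, dist):
    # Total length of the closed tour.
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def construct(dist, alpha, rng):
    # Greedy randomized construction: choose the next city at random
    # from a restricted candidate list (RCL) of the nearest remaining cities.
    tour, remaining = [0], set(range(1, len(dist)))
    while remaining:
        ranked = sorted(remaining, key=lambda c: dist[tour[-1]][c])
        rcl_size = max(1, int(alpha * len(ranked)))
        choice = rng.choice(ranked[:rcl_size])
        tour.append(choice)
        remaining.remove(choice)
    return tour

def local_search(tour, dist):
    # 2-opt: reverse segments while any reversal strictly lowers the cost.
    best = tour_cost(tour, dist)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                cost = tour_cost(cand, dist)
                if cost < best:
                    tour, best, improved = cand, cost, True
    return tour

def perturb(tour, rng):
    # Reuse a known solution: swap two random cities (the start stays fixed).
    t = tour[:]
    i, j = rng.sample(range(1, len(t)), 2)
    t[i], t[j] = t[j], t[i]
    return t

def grasp(dist, iterations=50, alpha=0.3, patience=5, seed=0):
    rng = random.Random(seed)
    best, best_cost, stale = None, float("inf"), 0
    for _ in range(iterations):
        if best is not None and stale >= patience:
            # Search has stagnated: restart from a perturbed copy of the best tour.
            start, stale = perturb(best, rng), 0
        else:
            start = construct(dist, alpha, rng)
        cand = local_search(start, dist)
        cost = tour_cost(cand, dist)
        if cost < best_cost:
            best, best_cost, stale = cand, cost, 0
        else:
            stale += 1
    return best, best_cost
```

Here the perturbation step only replaces the construction phase when no improvement has been seen for `patience` iterations, which mirrors the abstract's description of reusing solutions when new ones become hard to find; the repair procedure for infeasible solutions is not modelled.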
Towards Explainable Exploratory Landscape Analysis: Extreme Feature Selection for Classifying BBOB Functions
Renau, Quentin, Dreo, Johann, Doerr, Carola, Doerr, Benjamin
Facilitated by the recent advances of Machine Learning (ML), the automated design of optimization heuristics is currently shaking up evolutionary computation (EC). Where the design of hand-picked guidelines for choosing a most suitable heuristic has long dominated research activities in the field, automatically trained heuristics are now seen to outperform human-derived choices even for well-researched optimization tasks. ML-based EC is therefore no longer a futuristic vision, but has become an integral part of our community. A key criticism that ML-based heuristics are often faced with is their potential lack of explainability, which may hinder future developments. This applies in particular to supervised learning techniques which extrapolate algorithms' performance based on exploratory landscape analysis (ELA). In such applications, it is not uncommon to use dozens of problem features to build the models underlying the specific algorithm selection or configuration task. Our goal in this work is to analyze whether this many features are indeed needed. Using the classification of the BBOB test functions as testbed, we show that a surprisingly small number of features -- often fewer than four -- can suffice to achieve a 98\% accuracy. Interestingly, the number of features required to meet this threshold is found to decrease with the problem dimension. We show that the classification accuracy transfers to settings in which several instances are involved in training and testing. In the leave-one-instance-out setting, however, classification accuracy drops significantly, and the transformation-invariance of the features becomes a decisive success factor.
Fairness through Optimization
Chen, Violet Xinying, Hooker, J. N.
We propose optimization as a general paradigm for formalizing fairness in AI-based decision models. We argue that optimization models allow formulation of a wide range of fairness criteria as social welfare functions, while enabling AI to take advantage of highly advanced solution technology. We show how optimization models can assist fairness-oriented decision making in the context of neural networks, support vector machines, and rule-based systems by maximizing a social welfare function subject to appropriate constraints. In particular, we state tractable optimization models for a variety of functions that measure fairness or a combination of fairness and efficiency. These include several inequality metrics, Rawlsian criteria, the McLoone and Hoover indices, alpha fairness, the Nash and Kalai-Smorodinsky bargaining solutions, combinations of Rawlsian and utilitarian criteria, and statistical bias measures. All of these models can be efficiently solved by linear programming, mixed integer/linear programming, or (in two cases) specialized convex programming methods.
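As an illustration of one of the social welfare functions listed above, the sketch below evaluates alpha fairness using its standard textbook definition: the sum of u^(1-alpha)/(1-alpha) for alpha different from 1, and the sum of log u for alpha equal to 1. The function name and the requirement of strictly positive utilities are assumptions of this sketch; the paper's own optimization models are not reproduced here.

```python
import math

def alpha_fairness(utilities, alpha):
    """Alpha-fair social welfare of a list of strictly positive utilities.

    alpha = 0 gives the utilitarian sum, alpha = 1 gives the sum of logs
    (proportional fairness), and larger alpha puts more weight on the
    worst-off individuals.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if any(u <= 0 for u in utilities):
        raise ValueError("utilities must be strictly positive")
    if math.isclose(alpha, 1.0):
        return sum(math.log(u) for u in utilities)
    return sum(u ** (1 - alpha) / (1 - alpha) for u in utilities)
```

For two allocations with the same total utility, such as `[2, 2]` and `[1, 3]`, the function is indifferent at `alpha = 0` and prefers the equal allocation for any `alpha > 0`, which is the trade-off between efficiency and fairness that the parameter controls.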
SGD Generalizes Better Than GD (And Regularization Doesn't Help)
Amir, Idan, Koren, Tomer, Livni, Roi
We give a new separation result between the generalization performance of stochastic gradient descent (SGD) and of full-batch gradient descent (GD) in the fundamental stochastic convex optimization model. While for SGD it is well-known that $O(1/\epsilon^2)$ iterations suffice for obtaining a solution with $\epsilon$ excess expected risk, we show that with the same number of steps GD may overfit and emit a solution with $\Omega(1)$ generalization error. Moreover, we show that in fact $\Omega(1/\epsilon^4)$ iterations are necessary for GD to match the generalization performance of SGD, which is also tight due to recent work by Bassily et al. (2020). We further discuss how regularizing the empirical risk minimized by GD essentially does not change the above result, and revisit the concepts of stability, implicit bias and the role of the learning algorithm in generalization.
Riemannian Perspective on Matrix Factorization
Matrix completion is a classical problem in machine learning and signal processing that aims to recover an unknown low-rank matrix from only a few observed entries. Ever since the pioneering work by Candès and Recht (2009), there has been a flurry of works solving matrix completion with guarantees. See a survey by Candès and Recht (2012) and the introduction of (Ge et al., 2016) for detailed information. Among many approaches, one prominent approach widely used in practice is based on matrix factorizations, à la Burer and Monteiro (2003).
GraphDF: A Discrete Flow Model for Molecular Graph Generation
Luo, Youzhi, Yan, Keqiang, Ji, Shuiwang
We consider the problem of molecular graph generation using deep models. While graphs are discrete, most existing methods use continuous latent variables, resulting in inaccurate modeling of discrete graph structures. In this work, we propose GraphDF, a novel discrete latent variable model for molecular graph generation based on normalizing flow methods. GraphDF uses invertible modulo shift transforms to map discrete latent variables to graph nodes and edges. We show that the use of discrete latent variables reduces computational costs and eliminates the negative effect of dequantization. Comprehensive experimental results show that GraphDF outperforms prior methods on random generation, property optimization, and constrained optimization tasks.
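The invertibility of a modulo shift transform can be seen in a few lines. This is a minimal sketch, assuming a fixed integer shift `mu` and `k` discrete categories; in a flow model such as the one described, the shift would be produced by a learned network conditioned on the partially generated graph, which is not modelled here. The function names are hypothetical.

```python
def modulo_shift(z, mu, k):
    # Forward transform: map a discrete latent value z in {0, ..., k-1}
    # to a category by shifting it and wrapping around modulo k.
    return (z + mu) % k

def inverse_modulo_shift(x, mu, k):
    # Inverse transform: recover the latent value from the category.
    return (x - mu) % k
```

Because the map is a bijection on `{0, ..., k-1}` for any integer `mu`, the discrete values never need to be turned into continuous ones, so no dequantization step is involved.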
Parameter-free Stochastic Optimization of Variationally Coherent Functions
Orabona, Francesco, Pál, Dávid
We design and analyze an algorithm for first-order stochastic optimization of a large class of functions on $\mathbb{R}^d$. In particular, we consider the \emph{variationally coherent} functions which can be convex or non-convex. The iterates of our algorithm on variationally coherent functions converge almost surely to the global minimizer $\boldsymbol{x}^*$. Additionally, the very same algorithm with the same hyperparameters, after $T$ iterations guarantees on convex functions that the expected suboptimality gap is bounded by $\widetilde{O}(\|\boldsymbol{x}^* - \boldsymbol{x}_0\| T^{-1/2+\epsilon})$ for any $\epsilon>0$. It is the first algorithm to achieve both these properties at the same time. Also, the rate for convex functions essentially matches the performance of parameter-free algorithms. Our algorithm is an instance of the Follow The Regularized Leader algorithm with the added twist of using \emph{rescaled gradients} and time-varying linearithmic regularizers.
Epistocracy Algorithm: A Novel Hyper-heuristic Optimization Strategy for Solving Complex Optimization Problems
Mojab, Seyed Ziae Mousavi, Shams, Seyedmohammad, Soltanian-Zadeh, Hamid, Fotouhi, Farshad
This paper proposes a novel evolutionary algorithm called Epistocracy which incorporates human socio-political behavior and intelligence to solve complex optimization problems. The inspiration of the Epistocracy algorithm originates from a political regime where educated people have more voting power than the uneducated or less educated. The algorithm is a self-adaptive, multi-population optimizer in which the evolution process takes place in parallel for many populations led by a council of leaders. To avoid stagnation in poor local optima and to prevent premature convergence, the algorithm employs multiple mechanisms such as dynamic and adaptive leadership based on gravitational force, dynamic population allocation and diversification, variance-based step-size determination, and regression-based leadership adjustment. The algorithm uses a stratified sampling method called Latin Hypercube Sampling (LHS) to distribute the initial population more evenly for exploration of the search space and exploitation of the accumulated knowledge. To investigate the performance and evaluate the reliability of the algorithm, we have used a set of multimodal benchmark functions, and then applied the algorithm to the MNIST dataset to further verify the accuracy, scalability, and robustness of the algorithm. Experimental results show that the Epistocracy algorithm outperforms the tested state-of-the-art evolutionary and swarm intelligence algorithms in terms of performance, precision, and convergence.
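Latin Hypercube Sampling, mentioned above as the population initializer, can be sketched as follows. This is a generic textbook version on the unit hypercube, not the paper's implementation; the function name, the unit-cube domain, and the `seed` parameter are assumptions of this sketch.

```python
import random

def latin_hypercube(n_samples, n_dims, seed=None):
    """Draw n_samples points in [0, 1)^n_dims with one point per stratum.

    Each axis is cut into n_samples equal intervals; every interval on
    every axis contains exactly one sample.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        # Place one point uniformly at random inside each stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]
```

Compared with independent uniform sampling, this guarantees that no region along any single axis is left empty, which is the "more evenly distributed" property the abstract refers to. Points for a bounded search space can be obtained by rescaling each coordinate to the variable's range.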
Beyond traditional assumptions in fair machine learning
After challenging the validity of these assumptions in real-world applications, we propose ways to move forward when they are violated. First, we show that group fairness criteria purely based on statistical properties of observed data are fundamentally limited. Revisiting this limitation from a causal viewpoint we develop a more versatile conceptual framework, causal fairness criteria, and first algorithms to achieve them. We also provide tools to analyze how sensitive a believed-to-be causally fair algorithm is to misspecifications of the causal graph. Second, we overcome the assumption that sensitive data is readily available in practice. To this end we devise protocols based on secure multi-party computation to train, validate, and contest fair decision algorithms without requiring users to disclose their sensitive data or decision makers to disclose their models. Finally, we also accommodate the fact that outcome labels are often only observed when a certain decision has been made. We suggest a paradigm shift away from training predictive models towards directly learning decisions to relax the traditional assumption that labels can always be recorded. The main contribution of this thesis is the development of theoretically substantiated and practically feasible methods to move research on fair machine learning closer to real-world applications.
Benchmark and Survey of Automated Machine Learning Frameworks
Zöller, Marc-André (USU Software AG) | Huber, Marco F. (University of Stuttgart and Fraunhofer IPA)
Machine learning (ML) has become a vital part of many aspects of our daily lives. However, building well-performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to build machine learning applications automatically without extensive knowledge of statistics and machine learning. This paper is a combination of a survey on current AutoML methods and a benchmark of popular AutoML frameworks on real data sets. Driven by the selected frameworks for evaluation, we summarize and review important AutoML techniques and methods concerning every step in building an ML pipeline. The selected AutoML frameworks are evaluated on 137 data sets from established AutoML benchmark suites.