Optimization
Optimal Task Assignment to Heterogeneous Federated Learning Devices
Federated Learning provides new opportunities for training machine learning models while respecting data privacy. This technique is based on heterogeneous devices that work together to iteratively train a model while never sharing their own data. Given the synchronous nature of this training, the performance of Federated Learning systems is dictated by the slowest devices, also known as stragglers. In this paper, we investigate the problem of minimizing the duration of Federated Learning rounds by controlling how much data each device uses for training. We formulate this problem as a makespan minimization problem with identical, independent, and atomic tasks that have to be assigned to heterogeneous resources with non-decreasing cost functions while respecting lower and upper limits of tasks per resource. Based on this formulation, we propose a polynomial-time algorithm named OLAR and prove that it provides optimal schedules. We evaluate OLAR in an extensive experimental evaluation using simulation that includes comparisons to other algorithms from the state of the art and new extensions to them. Our results indicate that OLAR provides optimal solutions with a small execution time. They also show that the presence of lower and upper limits of tasks per resource erase any benefits that suboptimal heuristics could provide in terms of algorithm execution time.
MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces
Santana, Marlesson R. O., Melo, Luckeciano C., Camargo, Fernando H. F., Brandão, Bruno, Soares, Anderson, Oliveira, Renan M., Caetano, Sandor
Recommender Systems are especially challenging for marketplaces since they must maximize user satisfaction while maintaining the healthiness and fairness of such ecosystems. In this context, we observed a lack of resources to design, train, and evaluate agents that learn by interacting within these environments. For this matter, we propose MARS-Gym, an open-source framework to empower researchers and engineers to quickly build and evaluate Reinforcement Learning agents for recommendations in marketplaces. MARS-Gym addresses the whole development pipeline: data processing, model design and optimization, and multi-sided evaluation. We also provide the implementation of a diverse set of baseline agents, with a metrics-driven analysis of them in the Trivago marketplace dataset, to illustrate how to conduct a holistic assessment using the available metrics of recommendation, off-policy estimation, and fairness. With MARS-Gym, we expect to bridge the gap between academic research and production systems, as well as to facilitate the design of new algorithms and applications.
Wasserstein Distributionally Robust Inverse Multiobjective Optimization
Inverse multiobjective optimization provides a general framework for the unsupervised learning task of inferring parameters of a multiobjective decision making problem (DMP), based on a set of observed decisions from the human experts. However, the performance of this framework relies critically on the selection of appropriate decision making structure, a set of observed decisions that are sufficient and of high qualities, and a parameter space that contains enough information about the DMP. To hedge against the uncertainties in the hypothetical DMP, the data, and the parameter space, we investigate in this paper the distributionally robust approach for inverse multiobjective optimization. Specifically, we leverage the Wasserstein metric to construct a ball centering at the empirical distribution of these decisions. We then formulate a Wasserstein distributionally robust inverse multiobjective optimization problem (WRO-IMOP) that minimizes a worst-case expected loss function, where the worst case is taken over all distributions in the Wasserstein ball. We show that the excess risk of the WRO-IMOP estimator has a sub-linear convergence rate. Furthermore, we propose a semi-infinite reformulations of the WRO-IMOP and develop a cutting-plane algorithm that converges to any δ-optimal solution in finite iterations. Finally, we demonstrate the effectiveness of our method on both a synthetic multiobjective quadratic program and a real world portfolio optimization problem.
Online Convex Optimization in Changing Environments and its Application to Resource Allocation
In the era of the big data, we create and collect lots of data from all different kinds of sources: the Internet, the sensors, the consumer market, and so on. Many of the data are coming sequentially, and would like to be processed and understood quickly. One classic way of analyzing data is based on batch processing, in which the data is stored and analyzed in an offline fashion. However, when the volume of the data is too large, it is much more difficult and time-consuming to do batch processing than sequential processing. What's more, sequential data is usually changing dynamically, and needs to be understood on-the-fly in order to capture the changes. Online Convex Optimization (OCO) is a popular framework that matches the above sequential data processing requirement. Applications using OCO include online routing, online auctions, online classification and regression, as well as online resource allocation. Due to the general applicability of OCO to the sequential data and the rigorous theoretical guarantee, it has attracted lots of researchers to develop useful algorithms to fulfill different needs. In this thesis, we show our contributions to OCO's development by designing algorithms to adapt to changing environments.
Adversarial Training and Provable Robustness: A Tale of Two Objectives
We propose a principled framework that combines adversarial training and provable robustness verification for training certifiably robust neural networks. We formulate the training problem as a joint optimization problem with both empirical and provable robustness objectives and develop a novel gradient-descent technique that can eliminate bias in stochastic multi-gradients. We perform both theoretical analysis on the convergence of the proposed technique and experimental comparison with state-of-the-arts. Results on MNIST and CIFAR-10 show that our method can consistently match or outperform prior approaches for provable l infinity robustness. Notably, we achieve 6.60% verified test error on MNIST at epsilon = 0.3, and 66.57% on CIFAR-10 with epsilon = 8/255.
Filling a theatre in times of corona
Blom, Danny, Pendavingh, Rudi, Spieksma, Frits C. R.
Submitted to INFORMS Journal on Applied Analytics manuscript (Please, provide the manuscript number!) Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title. However, use of a template does not certify that the paper has been accepted for publication in the named journal. INFORMS journal templates are for the exclusive purpose of submitting to an INFORMS journal and should not be used to distribute the papers in print or online or to submit the papers to another publication. In this paper, we introduce an optimization problem posed by the Music Building Eindhoven (MBE) to deal with the economical consequences of the COVID-19 pandemic for theatre halls. We propose a model for maximizing the number of guests in a theatre hall that respects social distancing rules, and is based on trapezoid packings. Computational results show that up to 40% of the normal capacity can be used for a single show setting, and up to 70 % in case artists opt for two consecutive performances per evening. All around the world, the corona-crisis has hit the cultural sector hard. Festivals are cancelled, orchestra's are at the brink of bankruptcy, choirs have stopped performing, and theatres are struggling to survive.
Machine Learning in Airline Crew Pairing to Construct Initial Clusters for Dynamic Constraint Aggregation
Yaakoubi, Yassine, Soumis, François, Lacoste-Julien, Simon
The crew pairing problem (CPP) is generally modelled as a set partitioning problem where the flights have to be partitioned in pairings. A pairing is a sequence of flight legs separated by connection time and rest periods that starts and ends at the same base. Because of the extensive list of complex rules and regulations, determining whether a sequence of flights constitutes a feasible pairing can be quite difficult by itself, making CPP one of the hardest of the airline planning problems. In this paper, we first propose to improve the prototype Baseline solver of Desaulniers et al. (2020) by adding dynamic control strategies to obtain an efficient solver for large-scale CPPs: Commercial-GENCOL-DCA. These solvers are designed to aggregate the flights covering constraints to reduce the size of the problem. Then, we use machine learning (ML) to produce clusters of flights having a high probability of being performed consecutively by the same crew. The solver combines several advanced Operations Research techniques to assemble and modify these clusters, when necessary, to produce a good solution. We show, on monthly CPPs with up to 50 000 flights, that Commercial-GENCOL-DCA with clusters produced by ML-based heuristics outperforms Baseline fed by initial clusters that are pairings of a solution obtained by rolling horizon with GENCOL. The reduction of solution cost averages between 6.8% and 8.52%, which is mainly due to the reduction in the cost of global constraints between 69.79% and 78.11%.
A Coevolutionary Variable Neighborhood Search Algorithm for Discrete Multitasking (CoVNS): Application to Community Detection over Graphs
Osaba, Eneko, Villar-Rodriguez, Esther, Del Ser, Javier
The main goal of the multitasking optimization paradigm is to solve multiple and concurrent optimization tasks in a simultaneous way through a single search process. For attaining promising results, potential complementarities and synergies between tasks are properly exploited, helping each other by virtue of the exchange of genetic material. This paper is focused on Evolutionary Multitasking, which is a perspective for dealing with multitasking optimization scenarios by embracing concepts from Evolutionary Computation. This work contributes to this field by presenting a new multitasking approach named as Coevolutionary Variable Neighborhood Search Algorithm, which finds its inspiration on both the Variable Neighborhood Search metaheuristic and coevolutionary strategies. The second contribution of this paper is the application field, which is the optimal partitioning of graph instances whose connections among nodes are directed and weighted. This paper pioneers on the simultaneous solving of this kind of tasks. Two different multitasking scenarios are considered, each comprising 11 graph instances. Results obtained by our method are compared to those issued by a parallel Variable Neighborhood Search and independent executions of the basic Variable Neighborhood Search. The discussion on such results support our hypothesis that the proposed method is a promising scheme for simultaneous solving community detection problems over graphs.
Inverse Classification with Limited Budget and Maximum Number of Perturbed Samples
Koo, Jaehoon, Klabjan, Diego, Utke, Jean
Most recent machine learning research focuses on developing new classifiers for the sake of improving classification accuracy. With many well-performing state-of-the-art classifiers available, there is a growing need for understanding interpretability of a classifier necessitated by practical purposes such as to find the best diet recommendation for a diabetes patient. Inverse classification is a post modeling process to find changes in input features of samples to alter the initially predicted class. It is useful in many business applications to determine how to adjust a sample input data such that the classifier predicts it to be in a desired class. In real world applications, a budget on perturbations of samples corresponding to customers or patients is usually considered, and in this setting, the number of successfully perturbed samples is key to increase benefits. In this study, we propose a new framework to solve inverse classification that maximizes the number of perturbed samples subject to a per-feature-budget limits and favorable classification classes of the perturbed samples. We design algorithms to solve this optimization problem based on gradient methods, stochastic processes, Lagrangian relaxations, and the Gumbel trick. In experiments, we find that our algorithms based on stochastic processes exhibit an excellent performance in different budget settings and they scale well.
Online Structural Change-point Detection of High-dimensional Streaming Data via Dynamic Sparse Subspace Learning
Xu, Ruiyu, Wu, Jianguo, Yue, Xiaowei, Li, Yongxiang
High-dimensional streaming data are becoming increasingly ubiquitous in many fields. They often lie in multiple low-dimensional subspaces, and the manifold structures may change abruptly on the time scale due to pattern shift or occurrence of anomalies. However, the problem of detecting the structural changes in a real-time manner has not been well studied. To fill this gap, we propose a dynamic sparse subspace learning (DSSL) approach for online structural change-point detection of high-dimensional streaming data. A novel multiple structural change-point model is proposed and it is shown to be equivalent to maximizing a posterior under certain conditions. The asymptotic properties of the estimators are investigated. The penalty coefficients in our model can be selected by AMDL criterion based on some historical data. An efficient Pruned Exact Linear Time (PELT) based method is proposed for online optimization and change-point detection. The effectiveness of the proposed method is demonstrated through a simulation study and a real case study using gesture data for motion tracking.