AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Multifactorial Cellular Genetic Algorithm (MFCGA): Algorithmic Design, Performance Comparison and Genetic Transferability Analysis

Osaba, Eneko, Martinez, Aritz D., Lobo, Jesus L., Del Ser, Javier, Herrera, Francisco

arXiv.org Artificial IntelligenceMar-24-2020

Multitasking optimization is an incipient research area which is lately gaining a notable research momentum. Unlike traditional optimization paradigm that focuses on solving a single task at a time, multitasking addresses how multiple optimization problems can be tackled simultaneously by performing a single search process. The main objective to achieve this goal efficiently is to exploit synergies between the problems (tasks) to be optimized, helping each other via knowledge transfer (thereby being referred to as Transfer Optimization). Furthermore, the equally recent concept of Evolutionary Multitasking (EM) refers to multitasking environments adopting concepts from Evolutionary Computation as their inspiration for the simultaneous solving of the problems under consideration. As such, EM approaches such as the Multifactorial Evolutionary Algorithm (MFEA) has shown a remarkable success when dealing with multiple discrete, continuous, single-, and/or multi-objective optimization problems. In this work we propose a novel algorithmic scheme for Multifactorial Optimization scenarios - the Multifactorial Cellular Genetic Algorithm (MFCGA) - that hinges on concepts from Cellular Automata to implement mechanisms for exchanging knowledge among problems. We conduct an extensive performance analysis of the proposed MFCGA and compare it to the canonical MFEA under the same algorithmic conditions and over 15 different multitasking setups (encompassing different reference instances of the discrete Traveling Salesman Problem). A further contribution of this analysis beyond performance benchmarking is a quantitative examination of the genetic transferability among the problem instances, eliciting an empirical demonstration of the synergies emerged between the different optimization tasks along the MFCGA search process.

algorithm, mfea, optimization, (13 more...)

arXiv.org Artificial Intelligence

2003.10768

Country:

North America > United States > New York (0.04)
North America > United States > Nevada > Washoe County > Reno (0.04)
Europe > Spain > Basque Country (0.04)
Europe > Spain > Andalusia > Granada Province > Granada (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Born-Again Tree Ensembles

Vidal, Thibaut, Pacheco, Toni, Schiffer, Maximilian

arXiv.org Machine LearningMar-24-2020

The use of machine learning algorithms in finance, medicine, and criminal justice can deeply impact human lives. As a consequence, research into interpretable machine learning has rapidly grown in an attempt to better control and fix possible sources of mistakes and biases. Tree ensembles offer a good prediction quality in various domains, but the concurrent use of multiple trees reduces the interpretability of the ensemble. Against this background, we study born-again tree ensembles, i.e., the process of constructing a single decision tree of minimum size that reproduces the exact same behavior as a given tree ensemble. To find such a tree, we develop a dynamic-programming based algorithm that exploits sophisticated pruning and bounding rules to reduce the number of recursive calls. This algorithm generates optimal born-again trees for many datasets of practical interest, leading to classifiers which are typically simpler and more interpretable without any other form of compromise.

algorithm, ensemble, tree ensemble, (14 more...)

arXiv.org Machine Learning

2003.11132

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States > Illinois (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.46)

Add feedback

Unsupervised Domain Adaptation Through Transferring both the Source-Knowledge and Target-Relatedness Simultaneously

Tian, Qing, Ma, Chuang, Cao, Meng, Chen, Songcan

arXiv.org Machine LearningMar-24-2020

Unsupervised domain adaptation (UDA) is an emerging research topic in the field of machine learning and pattern recognition, which aims to help the learning of unlabeled target domain by transferring knowledge from the source domain. To perform UDA, a variety of methods have been proposed, most of which concentrate on the scenario of single source and single target domain (1S1T). However, in real applications, usually single source domain with multiple target domains are involved (1SmT), which cannot be handled directly by those 1S1T models. Unfortunately, although a few related works on 1SmT UDA have been proposed, nearly none of them model the source domain knowledge and leverage the target-relatedness jointly. To overcome these shortcomings, we herein propose a more general 1SmT UDA model through transferring both the Source-Knowledge and Target-Relatedness, UDA-SKTR for short. In this way, not only the supervision knowledge from the source domain, but also the potential relatedness among the target domains are simultaneously modeled for exploitation in the process of 1SmT UDA. In addition, we construct an alternating optimization algorithm to solve the variables of the proposed model with convergence guarantee. Finally, through extensive experiments on both benchmark and real datasets, we validate the effectiveness and superiority of the proposed method.

dataset, source domain, target domain, (16 more...)

arXiv.org Machine Learning

2003.08051

Country:

Asia > China > Jiangsu Province > Nanjing (0.05)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.69)
(3 more...)

Add feedback

Diagonal Preconditioning: Theory and Algorithms

Qu, Zhaonan, Ye, Yinyu, Zhou, Zhengyuan

arXiv.org Machine LearningMar-24-2020

Diagonal preconditioning has been a staple technique in optimization and machine learning. It often reduces the condition number of the design or Hessian matrix it is applied to, thereby speeding up convergence. However, rigorous analyses of how well various diagonal preconditioning procedures improve the condition number of the preconditioned matrix and how that translates into improvements in optimization are rare. In this paper, we first provide an analysis of a popular diagonal preconditioning technique based on column standard deviation and its effect on the condition number using random matrix theory. Then we identify a class of design matrices whose condition numbers can be reduced significantly by this procedure. We then study the problem of optimal diagonal preconditioning to improve the condition number of any full-rank matrix and provide a bisection algorithm and a potential reduction algorithm with $O(\log(\frac{1}{\epsilon}))$ iteration complexity, where each iteration consists of an SDP feasibility problem and a Newton update using the Nesterov-Todd direction, respectively. Finally, we extend the optimal diagonal preconditioning algorithm to an adaptive setting and compare its empirical performance at reducing the condition number and speeding up convergence for regression and classification problems with that of another adaptive preconditioning technique, namely batch normalization, that is essential in training machine learning models.

condition number, diagonal, matrix, (16 more...)

arXiv.org Machine Learning

2003.07545

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

COEBA: A Coevolutionary Bat Algorithm for Discrete Evolutionary Multitasking

Osaba, Eneko, Del Ser, Javier, Yang, Xin-She, Iglesias, Andres, Galvez, Akemi

arXiv.org Artificial IntelligenceMar-24-2020

Multitasking optimization is an emerging research field which has attracted lot of attention in the scientific community. The main purpose of this paradigm is how to solve multiple optimization problems or tasks simultaneously by conducting a single search process. The main catalyst for reaching this objective is to exploit possible synergies and complementarities among the tasks to be optimized, helping each other by virtue of the transfer of knowledge among them (thereby being referred to as Transfer Optimization). In this context, Evolutionary Multitasking addresses Transfer Optimization problems by resorting to concepts from Evolutionary Computation for simultaneous solving the tasks at hand. This work contributes to this trend by proposing a novel algorithmic scheme for dealing with multitasking environments. The proposed approach, coined as Coevolutionary Bat Algorithm, finds its inspiration in concepts from both co-evolutionary strategies and the metaheuristic Bat Algorithm. We compare the performance of our proposed method with that of its Multifactorial Evolutionary Algorithm counterpart over 15 different multitasking setups, composed by eight reference instances of the discrete Traveling Salesman Problem. The experimentation and results stemming therefrom support the main hypothesis of this study: the proposed Coevolutionary Bat Algorithm is a promising meta-heuristic for solving Evolutionary Multitasking scenarios.

algorithm, coeba, mfea, (12 more...)

arXiv.org Artificial Intelligence

2003.11628

Country:

Europe > United Kingdom (0.14)
North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Real-Time Dispatching of Large-Scale Ride-Sharing Systems: Integrating Optimization, Machine Learning, and Model Predictive Control

Riley, Connor, Van Hentenryck, Pascal, Yuan, Enpeng

arXiv.org Artificial IntelligenceMar-24-2020

This paper considers the dispatching of large-scale real-time ride-sharing systems to address congestion issues faced by many cities. The goal is to serve all customers (service guarantees) with a small number of vehicles while minimizing waiting times under constraints on ride duration. This paper proposes an end-to-end approach that tightly integrates a state-of-the-art dispatching algorithm, a machine-learning model to predict zone-to-zone demand over time, and a model predictive control optimization to relocate idle vehicles. Experiments using historic taxi trips in New York City indicate that this integration decreases average waiting times by about 30% over all test cases and reaches close to 55% on the largest instances for high-demand zones.

optimization, relocation, vehicle, (14 more...)

arXiv.org Artificial Intelligence

2003.10942

Country:

North America > United States > New York (0.25)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.64)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Add feedback

Slow and Stale Gradients Can Win the Race

Dutta, Sanghamitra, Wang, Jianyu, Joshi, Gauri

arXiv.org Machine LearningMar-23-2020

Distributed Stochastic Gradient Descent (SGD) when run in a synchronous manner, suffers from delays in runtime as it waits for the slowest workers (stragglers). Asynchronous methods can alleviate stragglers, but cause gradient staleness that can adversely affect the convergence error. In this work, we present a novel theoretical characterization of the speedup offered by asynchronous methods by analyzing the trade-off between the error in the trained model and the actual training runtime(wallclock time). The main novelty in our work is that our runtime analysis considers random straggling delays, which helps us design and compare distributed SGD algorithms that strike a balance between straggling and staleness. We also provide a new error convergence analysis of asynchronous SGD variants without bounded or exponential delay assumptions. Finally, based on our theoretical characterization of the error-runtime trade-off, we propose a method of gradually varying synchronicity in distributed SGD and demonstrate its performance on CIFAR10 dataset.

iteration, runtime, sgd, (12 more...)

arXiv.org Machine Learning

2003.10579

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates

Koloskova, Anastasia, Loizou, Nicolas, Boreiri, Sadra, Jaggi, Martin, Stich, Sebastian U.

arXiv.org Machine LearningMar-23-2020

Decentralized stochastic optimization methods have gained a lot of attention recently, mainly because of their cheap per iteration cost, data locality, and their communication-efficiency. In this paper we introduce a unified convergence analysis that covers a large variety of decentralized SGD methods which so far have required different intuitions, have different applications, and which have been developed separately in various communities. Our algorithmic framework covers local SGD updates and synchronous and pairwise gossip updates on adaptive network topology. We derive universal convergence rates for smooth (convex and non-convex) problems and the rates interpolate between the heterogeneous (non-identically distributed data) and iid-data settings, recovering linear convergence rates in many special cases, for instance for over-parametrized models. Our proofs rely on weak assumptions (typically improving over prior work in several aspects) and recover (and improve) the best known complexity results for a host of important scenarios, such as for instance coorperative SGD and federated averaging (local SGD).

optimization, sgd, unified theory, (13 more...)

arXiv.org Machine Learning

2003.10422

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > District of Columbia > Washington (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.47)

Add feedback

Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting

Wu, Lemeng, Ye, Mao, Lei, Qi, Lee, Jason D., Liu, Qiang

arXiv.org Machine LearningMar-23-2020

We propose signed splitting steepest descent (S3D), which progressively grows neural architectures by splitting critical neurons into multiple copies, following a theoretically-derived optimal scheme. Our algorithm is a generalization of the splitting steepest descent (S2D) of Liu et al. (2019b), but significantly improves over it by incorporating a rich set of new splitting schemes that allow negative output weights. By doing so, we can escape local optima that the original S2D can not escape. Theoretically, we show that our method provably learns neural networks with much smaller sizes than these needed for standard gradient descent in overparameterized regimes. Empirically, our method outperforms S2D and prior arts on various challenging benchmarks, including CIFAR-100, ImageNet and ModelNet40.

descent, neuron, splitting, (11 more...)

arXiv.org Machine Learning

2003.10392

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

A Nonconvex Low-Rank Tensor Completion Model for Spatiotemporal Traffic Data Imputation

Chen, Xinyu, Yang, Jinming, Sun, Lijun

arXiv.org Machine LearningMar-23-2020

Sparsity and missing data problems are very common in spatiotemporal traffic data collected from various sensing systems. Making accurate imputation is critical to many applications in intelligent transportation systems. In this paper, we formulate the missing data imputation problem in spatiotemporal traffic data in a low-rank tensor completion (LRTC) framework and define a novel truncated nuclear norm (TNN) on traffic tensors of location$\times$day$\times$time of day. In particular, we introduce an universal rate parameter to control the degree of truncation on all tensor modes in the proposed LRTC-TNN model, and this allows us to better characterize the hidden patterns in spatiotemporal traffic data. Based on the framework of the Alternating Direction Method of Multipliers (ADMM), we present an efficient algorithm to obtain the optimal solution for each variable. We conduct numerical experiments on four spatiotemporal traffic data sets, and our results show that the proposed LRTC-TNN model outperforms many state-of-the-art imputation models with missing rates/patterns. Moreover, the proposed model also outperforms other baseline models in extreme missing scenarios.

minimization, spatiotemporal traffic data, traffic data, (17 more...)

arXiv.org Machine Learning

2003.10271

Country:

North America > Canada > Quebec > Montreal (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Infrastructure & Services (0.49)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback