AITopics | Optimization

Optimization

ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization

Chen, Xiangyi, Liu, Sijia, Xu, Kaidi, Li, Xingguo, Lin, Xue, Hong, Mingyi, Cox, David

arXiv.org Machine Learning, Oct-15-2019

The adaptive momentum method (AdaMM), which uses past gradients to update descent directions and learning rates simultaneously, has become one of the most popular first-order optimization methods for solving machine learning problems. However, AdaMM is not suited for solving black-box optimization problems, where explicit gradient forms are difficult or infeasible to obtain. In this paper, we propose a zeroth-order AdaMM (ZO-AdaMM) algorithm that generalizes AdaMM to the gradient-free regime. We show that the convergence rate of ZO-AdaMM for both convex and nonconvex optimization is roughly a factor of $O(\sqrt{d})$ worse than that of the first-order AdaMM algorithm, where $d$ is the problem size. In particular, we provide a deep understanding of why the Mahalanobis distance matters in the convergence of ZO-AdaMM and other AdaMM-type methods. As a byproduct, our analysis makes the first step toward understanding adaptive learning rate methods for nonconvex constrained optimization. Furthermore, we demonstrate two applications, designing per-image and universal adversarial attacks from black-box neural networks, respectively. We perform extensive experiments on ImageNet and empirically show that ZO-AdaMM converges much faster to a solution of high accuracy compared with $6$ state-of-the-art ZO optimization methods.
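To make the gradient-free idea concrete, here is a minimal Python sketch; it is not the authors' implementation. It estimates a gradient from function values only, using random-direction finite differences, and feeds that estimate into an AMSGrad-style adaptive momentum update. The function names (`zo_gradient`, `zo_adamm`), the smoothing parameter `mu`, the number of directions, and all step sizes are illustrative assumptions. The sketch is unconstrained, so it omits the projection step in which the Mahalanobis distance would appear.

```python
import math
import random


def zo_gradient(f, x, mu=1e-3, num_dirs=10, rng=random):
    """Estimate the gradient of f at x from function values only.

    Averages forward differences along random unit directions
    (an assumed estimator; other zeroth-order estimators exist).
    """
    d = len(x)
    fx = f(x)
    grad = [0.0] * d
    for _ in range(num_dirs):
        # Draw a direction uniformly on the unit sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(ui * ui for ui in u))
        u = [ui / norm for ui in u]
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
        scale = d * (f(x_pert) - fx) / (mu * num_dirs)
        for i in range(d):
            grad[i] += scale * u[i]
    return grad


def zo_adamm(f, x0, steps=500, lr=0.05, beta1=0.9, beta2=0.999,
             eps=1e-8, seed=0):
    """Adaptive momentum update driven by zeroth-order gradient estimates."""
    rng = random.Random(seed)
    d = len(x0)
    x = list(x0)
    m = [0.0] * d       # first-moment (momentum) estimate
    v = [0.0] * d       # second-moment estimate
    v_hat = [0.0] * d   # running maximum of v (AMSGrad-style)
    for _ in range(steps):
        g = zo_gradient(f, x, rng=rng)
        for i in range(d):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
            v_hat[i] = max(v_hat[i], v[i])
            x[i] -= lr * m[i] / (math.sqrt(v_hat[i]) + eps)
    return x
```

A typical call treats the objective as a black box, for example `zo_adamm(lambda x: sum((xi - 1.0) ** 2 for xi in x), [0.0] * 5)`. Each step costs `num_dirs + 1` function evaluations, which is the price paid for not having gradients.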

algorithm, optimization, zo-adamm, (13 more...)

arXiv.org Machine Learning

1910.06513

Country:

North America > United States > Minnesota (0.04)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Transportation > Air (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Negatively Correlated Search as a Parallel Exploration Search Strategy

Yang, Peng, Tang, Ke, Yao, Xin

arXiv.org Artificial Intelligence, Oct-15-2019

Parallel exploration is a key to a successful search. The recently proposed Negatively Correlated Search (NCS) achieved this ability by constructing a set of negatively correlated search processes and has been applied to many real-world problems. In NCS, the key technique is to explicitly model and maximize the diversity among search processes in parallel. However, the original diversity model was mostly devised by intuition, which introduced several drawbacks to NCS. In this paper, a mathematically principled diversity model is proposed to solve the existing drawbacks of NCS, resulting in a new NCS framework. A new instantiation of NCS is also derived and its effectiveness is verified on a set of multi-modal continuous optimization problems.

diversity model, fitness value, sub, (15 more...)

arXiv.org Artificial Intelligence

1910.07151

Country:

North America > United States (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
Oceania > New Zealand (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Autonomous Aerial Cinematography In Unstructured Environments With Learned Artistic Decision-Making

Bonatti, Rogerio, Wang, Wenshan, Ho, Cherie, Ahuja, Aayush, Gschwindt, Mirko, Camci, Efe, Kayacan, Erdal, Choudhury, Sanjiban, Scherer, Sebastian

arXiv.org Artificial Intelligence, Oct-15-2019

Aerial cinematography is revolutionizing industries that require live and dynamic camera viewpoints such as entertainment, sports, and security. However, safely piloting a drone while filming a moving target in the presence of obstacles is immensely taxing, often requiring multiple expert human operators. Hence, there is demand for an autonomous cinematographer that can reason about both geometry and scene context in real-time. Existing approaches do not address all aspects of this problem; they either require high-precision motion-capture systems or GPS tags to localize targets, rely on prior maps of the environment, plan for short time horizons, or only follow artistic guidelines specified before flight. In this work, we address the problem in its entirety and propose a complete system for real-time aerial cinematography that for the first time combines: (1) vision-based target estimation; (2) 3D signed-distance mapping for occlusion estimation; (3) efficient trajectory optimization for long time-horizon camera motion; and (4) learning-based artistic shot selection. We extensively evaluate our system both in simulation and in field experiments by filming dynamic targets moving through unstructured environments. Our results indicate that our system can operate reliably in the real world without restrictive assumptions. We also provide in-depth analysis and discussions for each module, with the hope that our design tradeoffs can generalize to other related applications. Videos of the complete system can be found at: https://youtu.be/ookhHnqmlaU.

actor, obstacle, trajectory, (17 more...)

arXiv.org Artificial Intelligence

1910.06988

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
(6 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Graphics (1.00)
(8 more...)

BoTorch: Programmable Bayesian Optimization in PyTorch

Balandat, Maximilian, Karrer, Brian, Jiang, Daniel R., Daulton, Samuel, Letham, Benjamin, Wilson, Andrew Gordon, Bakshy, Eytan

arXiv.org Machine Learning, Oct-14-2019

Bayesian optimization provides sample-efficient global optimization for a broad range of applications, including automatic machine learning, molecular chemistry, and experimental design. We introduce BoTorch, a modern programming framework for Bayesian optimization. Enabled by Monte-Carlo (MC) acquisition functions and auto-differentiation, BoTorch's modular design facilitates flexible specification and optimization of probabilistic models written in PyTorch, radically simplifying implementation of novel acquisition functions. Our MC approach is made practical by a distinctive algorithmic foundation that leverages fast predictive distributions and hardware acceleration. In experiments, we demonstrate the improved sample efficiency of BoTorch relative to other popular libraries. BoTorch is open source and available at https://github.com/pytorch/botorch.
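As a rough illustration of what a Monte-Carlo acquisition function computes, the sketch below estimates expected improvement at a single candidate point by sampling from a Gaussian posterior and compares the result with the closed form. It is plain Python and deliberately does not use BoTorch's API. The function names, the scalar posterior (`mean`, `std`), and the sample count are assumptions made for the example.

```python
import math
import random


def analytic_ei(mean, std, best):
    """Closed-form expected improvement (maximization) for N(mean, std^2)."""
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return std * (z * cdf + pdf)


def mc_ei(mean, std, best, num_samples=50000, seed=0):
    """Monte-Carlo estimate of the same quantity.

    Each posterior sample is written as mean + std * eps with eps ~ N(0, 1),
    so the estimate is a smooth function of mean and std for fixed eps.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        sample = mean + std * rng.gauss(0.0, 1.0)
        total += max(sample - best, 0.0)
    return total / num_samples
```

Writing samples as `mean + std * eps` is what makes this kind of estimator compatible with auto-differentiation: with the noise held fixed, gradients with respect to the posterior parameters are well defined. The closed form exists only for simple cases such as this one, whereas the sampling version carries over to acquisition functions without an analytic expression.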

acquisition function, optimization, orch, (13 more...)

arXiv.org Machine Learning

1910.06403

Country:

Africa > Nigeria (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning

Karimireddy, Sai Praneeth, Kale, Satyen, Mohri, Mehryar, Reddi, Sashank J., Stich, Sebastian U., Suresh, Ananda Theertha

arXiv.org Machine Learning, Oct-14-2019

Federated learning is a key scenario in modern large-scale machine learning. In that scenario, the training data remains distributed over a large number of clients, which may be phones, other mobile devices, or network sensors and a centralized model is learned without ever transmitting client data over the network. The standard optimization algorithm used in this scenario is Federated Averaging (FedAvg). However, when client data is heterogeneous, which is typical in applications, FedAvg does not admit a favorable convergence guarantee. This is because local updates on clients can drift apart, which also explains the slow convergence and hard-to-tune nature of FedAvg in practice. This paper presents a new Stochastic Controlled Averaging algorithm (SCAFFOLD) which uses control variates to reduce the drift between different clients. We prove that the algorithm requires significantly fewer rounds of communication and benefits from favorable convergence guarantees.
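The effect of client drift, and of correcting it with control variates, can be seen on a toy problem. The sketch below is an assumption-laden illustration rather than the paper's algorithm as published: each client holds a scalar quadratic `f_i(x) = 0.5 * h * (x - a)**2`, gradients are exact, every client participates in every round, and the particular control-variate update rule and step sizes are choices made for this example.

```python
def run_rounds(clients, rounds=50, local_steps=5, lr_local=0.1,
               lr_global=1.0, use_control_variates=True):
    """Federated averaging on scalar quadratics, optionally drift-corrected.

    clients: list of (h, a) pairs defining f_i(x) = 0.5 * h * (x - a)**2.
    Returns the server model after the given number of rounds.
    """
    n = len(clients)
    x = 0.0              # server model
    c = 0.0              # server control variate
    c_i = [0.0] * n      # per-client control variates
    for _round in range(rounds):
        delta_y = []
        delta_c = []
        for i, (h, a) in enumerate(clients):
            y = x
            for _step in range(local_steps):
                grad = h * (y - a)
                # Correct the local gradient toward the global direction.
                correction = (c - c_i[i]) if use_control_variates else 0.0
                y -= lr_local * (grad + correction)
            if use_control_variates:
                # Refresh the client's control variate from its net progress.
                new_ci = c_i[i] - c + (x - y) / (local_steps * lr_local)
            else:
                new_ci = c_i[i]
            delta_y.append(y - x)
            delta_c.append(new_ci - c_i[i])
            c_i[i] = new_ci
        x += lr_global * sum(delta_y) / n
        c += sum(delta_c) / n
    return x
```

With `clients = [(1.0, 0.0), (3.0, 4.0)]`, the minimizer of the average objective is `x = 3`. Calling `run_rounds(clients)` approaches that value, while `run_rounds(clients, use_control_variates=False)` settles at a different point because each client's local steps pull toward its own optimum.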

algorithm, arxiv preprint arxiv, control variate, (14 more...)

arXiv.org Machine Learning

1910.06378

Country:

North America > United States > New York (0.04)
North America > United States > Virginia (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Loss Landscape Sightseeing with Multi-Point Optimization

Skorokhodov, Ivan, Burtsev, Mikhail

arXiv.org Machine Learning, Oct-14-2019

We present multi-point optimization: an optimization technique that allows training several models simultaneously without the need to keep the parameters of each one individually. The proposed method is used for a thorough empirical analysis of the loss landscape of neural networks. By extensive experiments on FashionMNIST and CIFAR10 datasets we demonstrate two things: 1) the loss surface is surprisingly diverse and intricate in terms of the landscape patterns it contains, and 2) adding batch normalization makes it smoother. Source code to reproduce all the reported results is available on GitHub: https://github.com/universome/loss-patterns.

batch normalization, landscape, loss surface, (13 more...)

arXiv.org Machine Learning

1910.03867

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Global-Local Metamodel Assisted Two-Stage Optimization via Simulation

Xie, Wei, Yi, Yuan, Zheng, Hua

arXiv.org Machine Learning, Oct-13-2019

To integrate strategic, tactical and operational decisions, two-stage optimization has been widely used to guide dynamic decision making. In this paper, we study two-stage stochastic programming for complex systems with an unknown response estimated by simulation. We introduce the global-local metamodel assisted two-stage optimization via simulation that can efficiently employ the simulation resource to iteratively solve for the optimal first- and second-stage decisions. Specifically, at each visited first-stage decision, we develop a local metamodel to simultaneously solve a set of scenario-based second-stage optimization problems, which also allows us to estimate the optimality gap. Then, we construct a global metamodel accounting for the errors induced by: (1) using a finite number of scenarios to approximate the expected future cost occurring in the planning horizon, (2) the second-stage optimality gap, and (3) a finite number of visited first-stage decisions. Assisted by the global-local metamodel, we propose a new simulation optimization approach that can efficiently and iteratively search for the optimal first- and second-stage decisions. Our framework can guarantee the convergence of the optimal solution for discrete two-stage optimization with an unknown objective, and the empirical study indicates that it achieves substantial efficiency and accuracy.
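The basic structure of a discrete two-stage problem with scenario-based recourse can be sketched on a toy newsvendor example. This is only an illustration of the problem class, under assumptions that do not come from the abstract: the second-stage cost is known in closed form (no simulation noise and no metamodels), the first-stage decision is an order quantity, and the names `second_stage_cost` and `solve_two_stage` are hypothetical.

```python
def second_stage_cost(order, demand, price):
    """Recourse cost once demand is revealed: negative sales revenue."""
    return -price * min(order, demand)


def solve_two_stage(candidates, scenarios, unit_cost, price):
    """Pick the first-stage order that minimizes cost plus average recourse.

    The expected second-stage cost is approximated by averaging over a
    finite list of demand scenarios.
    """
    best = None
    for order in candidates:
        recourse = sum(second_stage_cost(order, d, price)
                       for d in scenarios) / len(scenarios)
        total = unit_cost * order + recourse
        if best is None or total < best[1]:
            best = (order, total)
    return best
```

For example, `solve_two_stage(range(6), [1, 2, 3, 4], 1.0, 3.0)` enumerates order quantities 0 through 5 against four equally weighted demand scenarios. In the setting the abstract describes, each evaluation of the second stage would instead be a noisy simulation run, which is where a metamodel and an explicit treatment of the approximation errors become necessary.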

metamodel, optimization, optimization problem, (16 more...)

arXiv.org Machine Learning

1910.05863

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre:

Research Report (1.00)
Overview (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Energy > Renewable (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

AdaWISH: Faster Discrete Integration via Adaptive Quantiles

Ding, Fan, Wang, Hanjing, Sabharwal, Ashish, Xue, Yexiang

arXiv.org Machine Learning, Oct-13-2019

Discrete integration in a high dimensional space of $n$ variables poses fundamental challenges. The WISH algorithm reduces the intractable discrete integration problem into $n$ optimization queries subject to randomized constraints, obtaining a constant approximation guarantee. The optimization queries are expensive, which limits the applicability of WISH. We propose AdaWISH, which is able to obtain the same guarantee, but accesses only a small subset of queries of WISH. For example, when the number of function values is bounded by a constant, AdaWISH issues only $O(\log n)$ queries. The key idea is to query adaptively, taking advantage of the shape of the weight function. In general, we prove that AdaWISH has a regret of no more than $O(\log n)$ relative to an oracle that issues queries at data-dependent optimal points. Experimentally, AdaWISH gives precise estimates for discrete integration problems, of the same quality as that of WISH and better than several competing approaches, on a variety of probabilistic inference benchmarks, while saving substantially on the number of optimization queries compared to WISH. For example, it saves $81.5\%$ of WISH queries while retaining the quality of results on a suite of UAI inference challenge benchmarks.

adawish, algorithm, query, (14 more...)

arXiv.org Machine Learning

1910.05811

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

A preference learning framework for multiple criteria sorting with diverse additive value models and valued assignment examples

Liu, Jiapeng, Kadzinski, Milosz, Liao, Xiuwu, Mao, Xiaoxin, Wang, Yao

arXiv.org Machine Learning, Oct-12-2019

We present a preference learning framework for multiple criteria sorting. We consider sorting procedures applying an additive value model with diverse types of marginal value functions (including linear, piecewise-linear, splined, and general monotone ones) under a unified analytical framework. Unlike existing sorting methods that infer a preference model from crisp decision examples, where each reference alternative is assigned to a unique class, our framework allows considering valued assignment examples in which a reference alternative can be classified into multiple classes with respective credibility degrees. We propose an optimization model for constructing a preference model from such valued examples by maximizing the credible consistency among reference alternatives. To improve the predictive ability of the constructed model on new instances, we employ regularization techniques. Moreover, to enhance the capability of addressing large-scale datasets, we introduce a state-of-the-art algorithm that is widely used in the machine learning community to solve the proposed optimization model in a computationally efficient way. Using the constructed additive value model, we determine both crisp and valued assignments for non-reference alternatives. Moreover, we allow the Decision Maker to prioritize the importance of classes and give the method the flexibility to adjust classification performance across classes according to the specified priorities. The practical usefulness of the analytical framework is demonstrated on a real-world dataset by comparing it to several existing sorting methods.

decision example, marginal value function, value function, (15 more...)

arXiv.org Machine Learning

1910.05485

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Poland > Greater Poland Province > Poznań (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)

Efficient Inference and Exploration for Reinforcement Learning

Zhu, Yi, Dong, Jing, Lam, Henry

arXiv.org Machine Learning, Oct-11-2019

Despite an ever-growing literature on reinforcement learning algorithms and applications, much less is known about their statistical inference. In this paper, we investigate the large-sample behavior of the Q-value estimates with closed-form characterizations of the asymptotic variances. This allows us to efficiently construct confidence regions for Q-value and optimal value functions, and to develop policies to minimize their estimation errors. This also leads to a policy exploration strategy that relies on estimating the relative discrepancies among the Q estimates. Numerical experiments show that our exploration strategy outperforms other benchmark approaches.

optimization problem, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

1910.05471

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
