AITopics | Optimization

Optimization

A fast randomized incremental gradient method for decentralized non-convex optimization

arXiv.org Machine Learning, Nov-7-2020

We study decentralized non-convex finite-sum minimization problems described over a network of nodes, where each node possesses a local batch of data samples. We propose a single-timescale first-order randomized incremental gradient method, termed GT-SAGA. GT-SAGA is computationally efficient since it evaluates only one component gradient per node per iteration, and it achieves provably fast and robust performance by leveraging node-level variance reduction and network-level gradient tracking. For general smooth non-convex problems, we show almost sure and mean-squared convergence to a first-order stationary point and describe regimes of practical significance where GT-SAGA achieves a network-independent convergence rate and outperforms the existing approaches. When the global cost function further satisfies the Polyak-Łojasiewicz condition, we show that GT-SAGA exhibits global linear convergence to an optimal solution in expectation and describe regimes of practical interest where the performance is network-independent and improves upon the existing work. Numerical experiments based on real-world datasets are included to highlight the behavior and convergence aspects of the proposed method.
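
For readers unfamiliar with the Polyak-Łojasiewicz condition mentioned in the abstract, its standard textbook form is given below. The symbols are assumptions introduced here and do not come from the abstract: F is the global cost, F* its minimum value, and μ > 0 a constant. The paper may state the condition in a slightly different but equivalent form.

```latex
\frac{1}{2}\,\bigl\lVert \nabla F(x) \bigr\rVert^{2} \;\ge\; \mu \,\bigl( F(x) - F^{*} \bigr)
\qquad \text{for all } x .
```

The condition does not require convexity, but it implies that every stationary point is a global minimizer, which is why linear convergence to an optimal solution is possible under it.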

gradient, gt-saga, optimization, (15 more...)

arXiv.org Machine Learning

2011.03853

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Massachusetts > Middlesex County > Medford (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Stochastic Hill Climbing in Python from Scratch - DLTK.AI

#artificialintelligence, Nov-6-2020, 07:35:40 GMT

Stochastic hill climbing is an optimization algorithm. It makes use of randomness as part of the search process. This makes the algorithm appropriate for nonlinear objective functions where other local search algorithms do not operate well. It is also a local search algorithm, meaning that it modifies a single solution and searches the relatively local area of the search space until a local optimum is located. This means that it is appropriate for unimodal optimization problems or for use after the application of a global optimization algorithm.
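
The search loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the code from the linked tutorial: the objective `x ** 2`, the bounds, the Gaussian step, and every function and parameter name are assumptions chosen for the example.

```python
import random


def objective(x):
    # Hypothetical unimodal test function with its minimum at x = 0.
    return x ** 2


def stochastic_hill_climbing(objective, bounds, n_iterations, step_size, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Start from a random point inside the bounds.
    current = rng.uniform(lo, hi)
    current_score = objective(current)
    for _ in range(n_iterations):
        # Propose a random neighbour of the current solution (assumed Gaussian step).
        candidate = current + rng.gauss(0.0, step_size)
        # Clip the candidate back into the search bounds.
        candidate = min(max(candidate, lo), hi)
        candidate_score = objective(candidate)
        # Keep the candidate only if it is at least as good as the current point.
        if candidate_score <= current_score:
            current, current_score = candidate, candidate_score
    return current, current_score


best_x, best_score = stochastic_hill_climbing(objective, (-5.0, 5.0), 1000, 0.1)
print(best_x, best_score)
```

Because only a single solution is modified and worse candidates are never accepted, the search cannot leave the basin it starts in, which is why the method suits unimodal problems or refinement after a global search.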

algorithm, optimization algorithm, stochastic hill climbing, (13 more...)

#artificialintelligence

Genre: Instructional Material (0.35)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Sparse Approximate Solutions to Max-Plus Equations with Application to Multivariate Convex Regression

Tsilivis, Nikos, Tsiamis, Anastasios, Maragos, Petros

arXiv.org Machine Learning, Nov-6-2020

The max-plus semiring R ∪ {−∞} is equipped with the standard maximum and sum operations as its addition and multiplication, respectively. It has been used to represent various nonlinear processes, in areas such as scheduling and synchronization [2], [6], [9], geometry [22], control theory and optimization [1], [4], morphological image and signal analysis [15], [24], [28], and machine learning [7], [8], [29], [32], [33]. Max-plus algebra is obtained from conventional linear algebra by replacing addition with maximum and multiplication with addition, as an extension of the max-plus semiring to multiple dimensions. Hence, many of the aforementioned nonlinear processes enjoy some linear-like properties when described in terms of the max-plus algebra. In this paper we are interested in sparse max-plus representations, i.e. vectors which consist of as many uninformative (−∞) elements as possible.
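
To make the substitution concrete, here is a small sketch of a max-plus matrix-vector product in plain Python. It only illustrates the algebra and is not the paper's sparse approximation algorithm; the function name `maxplus_matvec` and the example matrix and vector are hypothetical.

```python
NEG_INF = float("-inf")  # the "zero" element of the max-plus semiring


def maxplus_matvec(A, x):
    # Max-plus product: (A ⊗ x)_i = max_j (A[i][j] + x[j]).
    # Addition is replaced by max, multiplication by ordinary addition.
    return [max(a + xj for a, xj in zip(row, x)) for row in A]


A = [[0.0, 2.0],
     [1.0, NEG_INF]]
x = [3.0, NEG_INF]  # a sparse max-plus vector: one entry is -inf

y = maxplus_matvec(A, x)
n_uninformative = sum(v == NEG_INF for v in x)
print(y, n_uninformative)
```

An entry equal to −∞ can never win the maximum, so it contributes nothing to the product. That is the sense in which such entries are uninformative and a vector with many of them is sparse.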

affine region, approximation, equation, (16 more...)

arXiv.org Machine Learning

2011.04468

Country:

North America > United States > Pennsylvania (0.04)
Europe > Greece > Attica > Athens (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Feature Removal Is a Unifying Principle for Model Explanation Methods

Covert, Ian, Lundberg, Scott, Lee, Su-In

arXiv.org Machine Learning, Nov-6-2020

Researchers have proposed a wide variety of model explanation approaches, but it remains unclear how most methods are related or when one method is preferable to another. We examine the literature and find that many methods are based on a shared principle of explaining by removing - essentially, measuring the impact of removing sets of features from a model. These methods vary in several respects, so we develop a framework for removal-based explanations that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence. Our framework unifies 25 existing methods, including several of the most widely used approaches (SHAP, LIME, Meaningful Perturbations, permutation tests). Exposing the fundamental similarities between these methods empowers users to reason about which tools to use and suggests promising directions for ongoing research in model explainability.
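
The three dimensions of the framework can be illustrated with a deliberately simple removal-based explanation. The sketch below is not SHAP, LIME, or any other method from the paper. It assumes a hypothetical linear `model`, removes a feature by replacing it with a baseline value, explains the prediction for a single input, and summarizes influence by a leave-one-out difference.

```python
def model(x):
    # Hypothetical linear model used only for illustration.
    weights = [2.0, -1.0, 0.0]
    return sum(w * v for w, v in zip(weights, x))


def remove_feature(x, index, baseline):
    # Dimension 1 (how features are removed): substitute a baseline value.
    x_removed = list(x)
    x_removed[index] = baseline[index]
    return x_removed


def leave_one_out_attributions(model, x, baseline):
    # Dimension 2 (what behavior is explained): the prediction for one input x.
    full_prediction = model(x)
    # Dimension 3 (how influence is summarized): the change in the prediction
    # when each feature is removed on its own.
    return [
        full_prediction - model(remove_feature(x, i, baseline))
        for i in range(len(x))
    ]


x = [1.0, 3.0, 5.0]
baseline = [0.0, 0.0, 0.0]
attributions = leave_one_out_attributions(model, x, baseline)
print(attributions)
```

Changing any one of the three choices (for example, averaging over sampled replacement values instead of using a fixed baseline) yields a different explanation method within the same removal-based family.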

arxiv preprint arxiv, explanation, remove feature, (15 more...)

arXiv.org Machine Learning

2011.03623

Country:

North America > United States > Washington > King County > Seattle (0.14)
Asia > Middle East > Jordan (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)
North America > United States > Washington > King County > Redmond (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Data Science > Data Mining (0.68)

Ridge Regression with Frequent Directions: Statistical and Optimization Perspectives

Dickens, Charlie

arXiv.org Machine Learning, Nov-6-2020

Despite its impressive theory and practical performance, Frequent Directions (FD) has not been widely adopted for large-scale regression tasks. Prior work has shown that randomized sketches (i) perform worse than FD in estimating the covariance matrix of the data; (ii) incur high error when estimating the bias and/or variance on sketched ridge regression. We give the first constant-factor relative error bounds on the bias and variance for sketched ridge regression using FD. We complement these statistical results by showing that FD can be used in the optimization setting through an iterative scheme which yields high-accuracy solutions. This improves on randomized approaches, which must trade off the need for a new sketch every iteration against speed of convergence. In both settings, we also show that using Robust Frequent Directions further enhances performance.

frequent direction, matrix, sketch, (11 more...)

arXiv.org Machine Learning

2011.03607

Country: South America > Paraguay > Asunción > Asunción (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)

Accelerating combinatorial filter reduction through constraints

Zhang, Yulin, Rahmani, Hazhar, Shell, Dylan A., O'Kane, Jason M.

arXiv.org Artificial Intelligence, Nov-6-2020

Reduction of combinatorial filters involves compressing state representations that robots use. Such optimization arises in automating the construction of minimalist robots. But exact combinatorial filter reduction is an NP-complete problem and all current techniques are either inexact or formalized with exponentially many constraints. This paper proposes a new formalization needing only a polynomial number of constraints, and characterizes these constraints in three different forms: nonlinear, linear, and conjunctive normal form. Empirical results show that constraints in conjunctive normal form capture the problem most effectively, leading to a method that outperforms the others. Further examination indicates that a substantial proportion of constraints remain inactive during iterative filter reduction. To leverage this observation, we introduce just-in-time generation of such constraints, which yields improvements in efficiency and has the potential to minimize large filters.

constraint, vertex, vertex cover, (16 more...)

arXiv.org Artificial Intelligence

2011.03471

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > United States > South Carolina > Richland County > Columbia (0.14)
Oceania > Australia > Queensland > Brisbane (0.04)
(8 more...)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)

Stochastic Approximation for High-frequency Observations in Data Assimilation

Zhang, Shushu, Patel, Vivak

arXiv.org Machine Learning, Nov-5-2020

With the increasing penetration of high-frequency sensors across a number of biological and physical systems, the abundance of the resulting observations offers opportunities for higher statistical accuracy of downstream estimates, but their frequency results in a plethora of computational problems in data assimilation tasks. The high frequency of these observations has traditionally been dealt with by using data modification strategies such as accumulation, averaging, and sampling. However, these data modification strategies reduce the quality of the estimates, which may be untenable for many systems. Therefore, to ensure high-quality estimates, we adapt stochastic approximation methods to address the unique challenges of high-frequency observations in data assimilation. As a result, we are able to produce estimates that leverage all of the observations in a manner that avoids the aforementioned computational problems and preserves the statistical accuracy of the estimates.

artificial intelligence, optimization problem, upstream oil & gas, (11 more...)

arXiv.org Machine Learning

2011.02672

Country: North America > United States > Wisconsin > Dane County > Madison (0.14)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.49)

Efficient Hyperparameter Tuning with Dynamic Accuracy Derivative-Free Optimization

Ehrhardt, Matthias J., Roberts, Lindon

arXiv.org Machine Learning, Nov-5-2020

Many machine learning solutions are framed as optimization problems which rely on good hyperparameters. Algorithms for tuning these hyperparameters usually assume access to exact solutions to the underlying learning problem, which is typically not practical. Here, we apply a recent dynamic accuracy derivative-free optimization method to hyperparameter tuning, which allows inexact evaluations of the learning problem while retaining convergence guarantees. We test the method on the problem of learning elastic net weights for a logistic classifier, and demonstrate its robustness and efficiency compared to a fixed accuracy approach. This demonstrates a promising approach for hyperparameter tuning, with both convergence guarantees and practical performance.

accuracy, algorithm, optimization, (13 more...)

arXiv.org Machine Learning

2011.03151

Country:

Europe > Switzerland (0.04)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Industry: Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

On stochastic mirror descent with interacting particles: convergence properties and variance reduction

Borovykh, Anastasia, Kantas, Nikolas, Parpas, Panos, Pavliotis, Grigorios A.

arXiv.org Machine Learning, Nov-5-2020

An open problem in optimization with noisy information is the computation of an exact minimizer that is independent of the amount of noise. A standard practice in stochastic approximation algorithms is to use a decreasing step-size. This, however, leads to slower convergence. A second option is to use a fixed step-size, run independent replicas of the algorithm, and average them. A third option is to run replicas of the algorithm and allow them to interact. It is unclear which of these options works best. To address this question, we reduce the problem of the computation of an exact minimizer with noisy gradient information to the study of stochastic mirror descent with interacting particles. We study the convergence of stochastic mirror descent and make explicit the tradeoffs between communication and variance reduction. We provide theoretical and numerical evidence to suggest that interaction helps to improve convergence and reduce the variance of the estimate.
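
The second option (fixed step-size with averaged independent replicas) can be sketched as follows. This is an assumption-laden toy and not the paper's interacting-particle scheme: it uses plain stochastic gradient descent, which is what mirror descent reduces to under the Euclidean mirror map, on the hypothetical objective f(x) = x²/2 with additive Gaussian gradient noise. All names and constants are illustrative.

```python
import random


def noisy_gradient(x, rng, noise_std=1.0):
    # Gradient of the hypothetical objective f(x) = x**2 / 2, plus Gaussian noise.
    return x + rng.gauss(0.0, noise_std)


def run_replica(rng, x0=5.0, step_size=0.1, n_iterations=500):
    # One independent replica of fixed step-size stochastic gradient descent.
    x = x0
    for _ in range(n_iterations):
        x -= step_size * noisy_gradient(x, rng)
    return x


rng = random.Random(0)
n_replicas = 50
finals = [run_replica(rng) for _ in range(n_replicas)]

# Averaging independent replicas reduces the variance of the final estimate
# by roughly a factor of n_replicas relative to a single replica.
averaged_estimate = sum(finals) / n_replicas
print(finals[0], averaged_estimate)
```

With a fixed step-size each replica keeps fluctuating around the minimizer at 0 instead of converging to it. Averaging shrinks those fluctuations without any communication between replicas, which is the baseline the interacting variant is compared against.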

convergence, descent, particle, (15 more...)

arXiv.org Machine Learning

2007.07704

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Mathematics of Computing (0.66)

Simple and optimal methods for stochastic variational inequalities, I: operator extrapolation

Kotsalis, Georgios, Lan, Guanghui, Li, Tianjiao

arXiv.org Artificial Intelligence, Nov-5-2020

In this paper we first present a novel operator extrapolation (OE) method for solving deterministic variational inequality (VI) problems. Similar to the gradient (operator) projection method, OE updates a single search sequence by solving a single projection subproblem in each iteration. We show that OE can achieve the optimal rate of convergence for solving a variety of VI problems in a much simpler way than existing approaches. We then introduce the stochastic operator extrapolation (SOE) method and establish its optimal convergence behavior for solving different stochastic VI problems. In particular, SOE achieves the optimal complexity for solving a fundamental problem, i.e., stochastic smooth and strongly monotone VI, for the first time in the literature. We also present a stochastic block operator extrapolation (SBOE) method to further reduce the iteration cost for the OE method applied to large-scale deterministic VIs with a certain block structure. Numerical experiments have been conducted to demonstrate the potential advantages of the proposed algorithms. In fact, all these algorithms are applied to solve generalized monotone variational inequality (GMVI) problems whose operator is not necessarily monotone. We will also discuss optimal OE-based policy evaluation methods for reinforcement learning in a companion paper.

algorithm, inequality, monotone, (15 more...)

arXiv.org Artificial Intelligence

2011.02987

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.88)