AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Uncertainty Principle based optimization; new metaheuristics framework

Moattari, Mojtaba, Moradi, Mohammad Hassan, Roshandel, Emad

arXiv.org Artificial IntelligenceJun-2-2020

To more flexibly balance between exploration and exploitation, a new meta-heuristic method based on Uncertainty Principle concepts is proposed in this paper. UP is is proved effective in multiple branches of science. In the branch of quantum mechanics, canonically conjugate observables such as position and momentum cannot both be distinctly determined in any quantum state. In the same manner, the branch of Spectral filtering design implies that a nonzero function and its Fourier transform cannot both be sharply localized. After delving into such concepts on Uncertainty Principle and their variations in quantum physics, Fourier analysis, and wavelet design, the proposed framework is described in terms of algorithm and flowchart. Our proposed optimizer's idea is based on an inherent uncertainty in performing local search versus global solution search. A set of compatible metrics for each part of the framework is proposed to derive preferred form of algorithm. Evaluations and comparisons at the end of paper show competency and distinct capability of the algorithm over some of the well-known and recently proposed metaheuristics.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2006.09981

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Europe > Netherlands (0.04)
Europe > Finland > Paijanne Tavastia > Lahti (0.04)
Asia > Middle East > Iran > Fars Province > Shiraz (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Maximizing Cumulative User Engagement in Sequential Recommendation: An Online Optimization Perspective

Zhao, Yifei, Zhou, Yu-Hang, Ou, Mingdong, Xu, Huan, Li, Nan

arXiv.org Artificial IntelligenceJun-2-2020

To maximize cumulative user engagement (e.g. cumulative clicks) in sequential recommendation, it is often needed to tradeoff two potentially conflicting objectives, that is, pursuing higher immediate user engagement (e.g., click-through rate) and encouraging user browsing (i.e., more items exposured). Existing works often study these two tasks separately, thus tend to result in sub-optimal results. In this paper, we study this problem from an online optimization perspective, and propose a flexible and practical framework to explicitly tradeoff longer user browsing length and high immediate user engagement. Specifically, by considering items as actions, user's requests as states and user leaving as an absorbing state, we formulate each user's behavior as a personalized Markov decision process (MDP), and the problem of maximizing cumulative user engagement is reduced to a stochastic shortest path (SSP) problem. Meanwhile, with immediate user engagement and quit probability estimation, it is shown that the SSP problem can be efficiently solved via dynamic programming. Experiments on real-world datasets demonstrate the effectiveness of the proposed approach. Moreover, this approach is deployed at a large E-commerce platform, achieved over 7% improvement of cumulative clicks.

artificial intelligence, machine learning, user engagement, (18 more...)

arXiv.org Artificial Intelligence

2006.0452

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(12 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Acceleration of Descent-based Optimization Algorithms via Carath\'eodory's Theorem

Cosentino, Francesco, Oberhauser, Harald, Abate, Alessandro

arXiv.org Machine LearningJun-2-2020

We propose a new technique to accelerate algorithms based on Gradient Descent using Carath\'eodory's Theorem. In the case of the standard Gradient Descent algorithm, we analyse the theoretical convergence of the approach under convexity assumptions and empirically display its ameliorations. As a core contribution, we then present an application of the acceleration technique to Block Coordinate Descent methods. Experimental comparisons on least squares regression with a LASSO regularisation term show remarkably improved performance on LASSO than the ADAM and SAG algorithms.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

2006.01819

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Denmark > North Jutland (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

A modification of quasi-Newton's methods helping to avoid saddle points

Truong, Tuyen Trung, To, Tat Dat, Nguyen, Tuan Hang, Nguyen, Thu Hang, Nguyen, Hoang Phuong, Helmy, Maged

arXiv.org Machine LearningJun-2-2020

We recall that if $A$ is an invertible and symmetric real $m\times m$ matrix, then it is diagonalisable. Therefore, if we denote by $\mathcal{E}^{+}(A)\subset \mathbb{R}^m$ (respectively $\mathcal{E}^{-}(A)\subset \mathbb{R}^m$) to be the vector subspace generated by eigenvectors with positive eigenvalues of $A$ (correspondingly the vector subspace generated by eigenvectors with negative eigenvalues of $A$), then we have an orthogonal decomposition $\mathbb{R}^m=\mathcal{E}^{+}(A)\oplus \mathcal{E}^{-}(A)$. Hence, every $x\in \mathbb{R}^m$ can be written uniquely as $x=pr_{A,+}(x)+pr_{A,-}(x)$ with $pr_{A,+}(x)\in \mathcal{E}^{+}(A)$ and $pr_{A,-}(x)\in \mathcal{E}^{-}(A)$. We propose the following simple new modification of quasi-Newton's methods. {\bf New Q-Newton's method.} Let $\Delta =\{\delta _0,\delta _1,\delta _2,\ldots \}$ be a countable set of real numbers which has at least $m+1$ elements. Let $f:\mathbb{R}^m\rightarrow \mathbb{R}$ be a $C^2$ function. Let $\alpha >0$. For each $x\in \mathbb{R}^m$ such that $\nabla f(x)\not=0$, let $\delta (x)=\delta _j$, where $j$ is the smallest number so that $\nabla ^2f(x)+\delta _j||\nabla f(x)||^{1+\alpha}Id$ is invertible. (If $\nabla f(x)=0$, then we choose $\delta (x)=\delta _0$.) Let $x_0\in \mathbb{R}^m$ be an initial point. We define a sequence of $x_n\in \mathbb{R}^m$ and invertible and symmetric $m\times m$ matrices $A_n$ as follows: $A_n=\nabla ^2f(x_n)+\delta (x_n) ||\nabla f(x_n)||^{1+\alpha}Id$ and $x_{n+1}=x_n-w_n$, where $w_n=pr_{A_n,+}(v_n)-pr_{A_n,-}(v_n)$ and $v_n=A_n^{-1}\nabla f(x_n)$. The main result of this paper roughly says that if $f$ is $C^3$ and a sequence $\{x_n\}$, constructed by the New Q-Newton's method from a random initial point $x_0$, {\bf converges}, then the limit point is not a saddle point, and the convergence rate is the same as that of Newton's method.

artificial intelligence, machine learning, optimization problem, (19 more...)

arXiv.org Machine Learning

2006.01512

Country:

Europe > Norway > Eastern Norway > Oslo (0.05)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Recht-R\'e Noncommutative Arithmetic-Geometric Mean Conjecture is False

Lai, Zehua, Lim, Lek-Heng

arXiv.org Machine LearningJun-2-2020

Stochastic optimization algorithms have become indispensable in modern machine learning. An unresolved foundational question in this area is the difference between with-replacement sampling and without-replacement sampling -- does the latter have superior convergence rate compared to the former? A groundbreaking result of Recht and R\'e reduces the problem to a noncommutative analogue of the arithmetic-geometric mean inequality where $n$ positive numbers are replaced by $n$ positive definite matrices. If this inequality holds for all $n$, then without-replacement sampling indeed outperforms with-replacement sampling. The conjectured Recht-R\'e inequality has so far only been established for $n = 2$ and a special case of $n = 3$. We will show that the Recht-R\'e conjecture is false for general $n$. Our approach relies on the noncommutative Positivstellensatz, which allows us to reduce the conjectured inequality to a semidefinite program and the validity of the conjecture to certain bounds for the optimum values, which we show are false as soon as $n = 5$.

artificial intelligence, inequality, machine learning, (15 more...)

arXiv.org Machine Learning

2006.0151

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

Add feedback

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning

Yao, Zhewei, Gholami, Amir, Shen, Sheng, Keutzer, Kurt, Mahoney, Michael W.

arXiv.org Machine LearningJun-1-2020

We introduce AdaHessian, a second order stochastic optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates of the Hessian. Second order algorithms are among the most powerful optimization algorithms with superior convergence properties as compared to first order methods such as SGD and ADAM. The main disadvantage of traditional second order methods is their heavier per-iteration computation and poor accuracy as compared to first order methods. To address these, we incorporate several novel approaches in AdaHessian, including: (i) a new variance reduction estimate of the Hessian diagonal with low computational overhead; (ii) a root-mean-square exponential moving average to smooth out variations of the Hessian diagonal across different iterations; and (iii) a block diagonal averaging to reduce the variance of Hessian diagonal elements. We show that AdaHessian achieves new state-of-the-art results by a large margin as compared to other adaptive optimization methods, including variants of ADAM. In particular, we perform extensive tests on CV, NLP, and recommendation system tasks and find that AdaHessian: (i) achieves 1.80\%/1.45\% higher accuracy on ResNets20/32 on Cifar10, and 5.55\% higher accuracy on ImageNet as compared to ADAM; (ii) outperforms ADAMW for transformers by 0.27/0.33 BLEU score on IWSLT14/WMT14 and 1.8/1.0 PPL on PTB/Wikitext-103; and (iii) achieves 0.032\% better score than AdaGrad for DLRM on the Criteo Ad Kaggle dataset. Importantly, we show that the cost per iteration of AdaHessian is comparable to first-order methods, and that it exhibits robustness towards its hyperparameters. The code for AdaHessian is open-sourced and publicly available.

adamw, essian, hessian diagonal, (12 more...)

arXiv.org Machine Learning

2006.00719

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Education (0.46)
Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Add feedback

Dynamic Bidding Strategies with Multivariate Feedback Control for Multiple Goals in Display Advertising

Tashman, Michael, Xie, Jiayi, Hoffman, John, Winikor, Lee, Gerami, Rouzbeh

arXiv.org Machine LearningJun-1-2020

Real-Time Bidding (RTB) display advertising is a method for purchasing display advertising inventory in auctions that occur within milliseconds. The performance of RTB campaigns is generally measured with a series of Key Performance Indicators (KPIs) - measurements used to ensure that the campaign is cost-effective and that it is purchasing valuable inventory. While an RTB campaign should ideally meet all KPIs, simultaneous improvement tends to be very challenging, as an improvement to any one KPI risks a detrimental effect toward the others. Here we present an approach to simultaneously controlling multiple KPIs with a PID-based feedback-control system. This method generates a control score for each KPI, based on both the output of a PID controller module and a metric that quantifies the importance of each KPI for internal business needs. On regular intervals, this algorithm - Sequential Control - will choose the KPI with the greatest overall need for improvement. In this way, our algorithm is able to continually seek the greatest marginal improvements to its current state. Multiple methods of control can be associated with each KPI, and can be triggered either simultaneously or chosen stochastically, in order to avoid local optima. In both offline ad bidding simulations and testing on live traffic, our methods proved to be effective in simultaneously controlling multiple KPIs, and bringing them toward their respective goals.

artificial intelligence, kpi, optimization problem, (16 more...)

arXiv.org Machine Learning

2007.00426

Country:

North America > United States > New York > New York County > New York City (0.06)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.83)

Industry:

Marketing (1.00)
Information Technology > Services (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Robust Reinforcement Learning with Wasserstein Constraint

Hou, Linfang, Pang, Liang, Hong, Xin, Lan, Yanyan, Ma, Zhiming, Yin, Dawei

arXiv.org Machine LearningJun-1-2020

Robust Reinforcement Learning aims to find the optimal policy with some extent of robustness to environmental dynamics. Existing learning algorithms usually enable the robustness through disturbing the current state or simulating environmental parameters in a heuristic way, which lack quantified robustness to the system dynamics (i.e. transition probability). To overcome this issue, we leverage Wasserstein distance to measure the disturbance to the reference transition kernel. With Wasserstein distance, we are able to connect transition kernel disturbance to the state disturbance, i.e. reduce an infinite-dimensional optimization problem to a finite-dimensional risk-aware problem. Through the derived risk-aware optimal Bellman equation, we show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm--Wasserstein Robust Advantage Actor-Critic algorithm (WRAAC). The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2006.00945

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > Romania > Sud-Est Development Region > Tulcea County > Tulcea (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Bayesian Optimisation vs. Input Uncertainty Reduction

Ungredda, Juan, Pearce, Michael, Branke, Juergen

arXiv.org Machine LearningMay-31-2020

Simulators often require calibration inputs estimated from real world data and the quality of the estimate can significantly affect simulation output. Particularly when performing simulation optimisation to find an optimal solution, the uncertainty in the inputs significantly affects the quality of the found solution. One remedy is to search for the solution that has the best performance on average over the uncertain range of inputs yielding an optimal compromise solution. We consider the more general setting where a user may choose between either running simulations or instead collecting real world data. A user may choose an input and a solution and observe the simulation output, or instead query an external data source improving the input estimate enabling the search for a more focused, less compromised solution. We explicitly examine the trade-off between simulation and real data collection in order to find the optimal solution of the simulator with the true inputs. Using a value of information procedure, we propose a novel unified simulation optimisation procedure called Bayesian Information Collection and Optimisation (BICO) that, in each iteration, automatically determines which of the two actions (running simulations or data collection) is more beneficial. Numerical experiments demonstrate that the proposed algorithm is able to automatically determine an appropriate balance between optimisation and data collection.

bayesian inference, input uncertainty, optimization problem, (17 more...)

arXiv.org Machine Learning

2006.00643

Country: North America > United States > Nevada (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Global Convergence of MAML for LQR

Molybog, Igor, Lavaei, Javad

arXiv.org Machine LearningMay-31-2020

The paper studies the performance of the Model-Agnostic Meta-Learning (MAML) algorithm as an optimization method. The goal is to determine the global convergence of MAML on sequential decision-making tasks possessing a common structure. We prove that the benign landscape of a single task leads to the global convergence of MAML in the single-task scenario and in the scenario of multiple structurally connected tasks. We also show that there is a two-task scenario that does not possess this global convergence property even for identical tasks. We analyze the landscape of the MAML objective on LQR tasks to determine what type of similarities in their structures enables the algorithm to converge to the globally optimal solution.

artificial intelligence, machine learning, objective, (17 more...)

arXiv.org Machine Learning

2006.00453

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback