AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Cautious Reinforcement Learning via Distributional Risk in the Dual Domain

Zhang, Junyu, Bedi, Amrit Singh, Wang, Mengdi, Koppel, Alec

arXiv.org Artificial IntelligenceFeb-27-2020

We study the estimation of risk-sensitive policies in reinforcement learning problems defined by a Markov Decision Process (MDPs) whose state and action spaces are countably finite. Prior efforts are predominately afflicted by computational challenges associated with the fact that risk-sensitive MDPs are time-inconsistent. To ameliorate this issue, we propose a new definition of risk, which we call caution, as a penalty function added to the dual objective of the linear programming (LP) formulation of reinforcement learning. The caution measures the distributional risk of a policy, which is a function of the policy's long-term state occupancy distribution. To solve this problem in an online model-free manner, we propose a stochastic variant of primal-dual method that uses Kullback-Lieber (KL) divergence as its proximal term. We establish that the number of iterations/samples required to attain approximately optimal solutions of this scheme matches tight dependencies on the cardinality of the state and action spaces, but differs in its dependence on the infinity norm of the gradient of the risk measure. Experiments demonstrate the merits of this approach for improving the reliability of reward accumulation without additional computational burdens.

algorithm 1, distributional risk, theorem 5, (16 more...)

arXiv.org Artificial Intelligence

2002.12475

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Maryland > Prince George's County > Adelphi (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification

Calandriello, Daniele, Carratino, Luigi, Lazaric, Alessandro, Valko, Michal, Rosasco, Lorenzo

arXiv.org Machine LearningFeb-26-2020

Gaussian processes (GP) are one of the most successful frameworks to model uncertainty. However, GP optimization (e.g., GP-UCB) suffers from major scalability issues. Experimental time grows linearly with the number of evaluations, unless candidates are selected in batches (e.g., using GP-BUCB) and evaluated in parallel. Furthermore, computational cost is often prohibitive since algorithms such as GP-BUCB require a time at least quadratic in the number of dimensions and iterations to select each batch. In this paper, we introduce BBKB (Batch Budgeted Kernel Bandits), the first no-regret GP optimization algorithm that provably runs in near-linear time and selects candidates in batches. This is obtained with a new guarantee for the tracking of the posterior variances that allows BBKB to choose increasingly larger batches, improving over GP-BUCB. Moreover, we show that the same bound can be used to adaptively delay costly updates to the sparse GP approximation used by BBKB, achieving a near-constant per-step amortized cost. These findings are then confirmed in several experiments, where BBKB is much faster than state-of-the-art methods.

batch, bbkb, posterior, (15 more...)

arXiv.org Machine Learning

2002.09954

Country:

Europe > Italy (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Add feedback

State-only Imitation with Transition Dynamics Mismatch

Gangwani, Tanmay, Peng, Jian

arXiv.org Machine LearningFeb-26-2020

Imitation Learning (IL) is a popular paradigm for training agents to achieve complicated goals by leveraging expert behavior, rather than dealing with the hardships of designing a correct reward function. With the environment modeled as a Markov Decision Process (MDP), most of the existing IL algorithms are contingent on the availability of expert demonstrations in the same MDP as the one in which a new imitator policy is to be learned. This is uncharacteristic of many real-life scenarios where discrepancies between the expert and the imitator MDPs are common, especially in the transition dynamics function. Furthermore, obtaining expert actions may be costly or infeasible, making the recent trend towards state-only IL (where expert demonstrations constitute only states or observations) ever so promising. Building on recent adversarial imitation approaches that are motivated by the idea of divergence minimization, we present a new state-only IL algorithm in this paper. It divides the overall optimization objective into two subproblems by introducing an indirection step and solves the subproblems iteratively. We show that our algorithm is particularly effective when there is a transition dynamics mismatch between the expert and imitator MDPs, while the baseline IL methods suffer from performance degradation. To analyze this, we construct several interesting MDPs by modifying the configuration parameters for the MuJoCo locomotion tasks from OpenAI Gym 1 .

algorithm, demonstration, trajectory, (14 more...)

arXiv.org Machine Learning

2002.11879

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
(2 more...)

Add feedback

Optimality and Stability in Non-Convex-Non-Concave Min-Max Optimization

Zhang, Guojun, Poupart, Pascal, Yu, Yaoliang

arXiv.org Machine LearningFeb-26-2020

The existence of a saddle point in convex-concave games follows from the celebrated minimax theorem (von Neumann, 1928; Sion et al., 1958) and numerical algorithms for finding it has a long history in optimization (Dem'yanov and Malozemov, 1974; Nemirovsky and Yudin, 1983; Zhang et al., 2019; Lin et al., 2020). Recent success in generative adversarial networks (GANs) (Goodfellow et al., 2014; Heusel et al., 2017) and reinforcement learning (Sutton et al., 1998; Du et al., 2017; Dai et al., 2018) has imposed new challenges for non-convex-non-concave (NCNC) settings. An important question is: what is a local optimal point in min-max optimization? Daskalakis and Panageas (2018) used a local version of saddle points to define local optimality and studied the local convergence behavior of gradient descent (GD) (Arrow et al., 1958) and optimistic gradient descent (OGD) (Popov, 1980; Daskalakis et al., 2018). Following this work, Jin et al. (2019) proposed a new definition of local optimality called local min-max points, and studied how GD converges to such points. One of the difficulties in the NCNC settings is the absence of strong duality (e.g. Boyd and Vandenberghe (2004)), which implies an intrinsic order for min-max optimization, known as Stackelberg games (von Stackelberg, 1934) where a leader takes action first and a follower acts upon the leader's strategy. The global solution to Stackelberg games is well-defined, which we call a global min-max point (a.k.a.

local min-max point, min-max point, saddle point, (17 more...)

arXiv.org Machine Learning

2002.11875

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)

Add feedback

Efficient algorithms for multivariate shape-constrained convex regression problems

Lin, Meixia, Sun, Defeng, Toh, Kim-Chuan

arXiv.org Machine LearningFeb-26-2020

Shape-constrained convex regression problem deals with fitting a convex function to the observed data, where additional constraints are imposed, such as component-wise monotonicity and uniform Lipschitz continuity. This paper provides a comprehensive mechanism for computing the least squares estimator of a multivariate shape-constrained convex regression function in $\mathbb{R}^d$. We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints, where $n$ is the number of data points. For solving the generally very large-scale convex QP, we design two efficient algorithms, one is the symmetric Gauss-Seidel based alternating direction method of multipliers ({\tt sGS-ADMM}), and the other is the proximal augmented Lagrangian method ({\tt pALM}) with the subproblems solved by the semismooth Newton method ({\tt SSN}). Comprehensive numerical experiments, including those in the pricing of basket options and estimation of production functions in economics, demonstrate that both of our proposed algorithms outperform the state-of-the-art algorithm. The {\tt pALM} is more efficient than the {\tt sGS-ADMM} but the latter has the advantage of being simpler to implement.

algorithm, convex regression, regression, (14 more...)

arXiv.org Machine Learning

2002.1141

Country:

Asia > Singapore (0.14)
Asia > China > Hong Kong (0.04)
South America > Chile (0.04)
Europe > France (0.04)

Genre: Research Report (0.50)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Add feedback

Convergence to Second-Order Stationarity for Non-negative Matrix Factorization: Provably and Concurrently

Panageas, Ioannis, Skoulakis, Stratis, Varvitsiotis, Antonios, Wang, Xiao

arXiv.org Machine LearningFeb-26-2020

Non-negative matrix factorization (NMF) is a fundamental non-convex optimization problem with numerous applications in Machine Learning (music analysis, document clustering, speech-source separation etc). Despite having received extensive study, it is poorly understood whether or not there exist natural algorithms that can provably converge to a local minimum. Part of the reason is because the objective is heavily symmetric and its gradient is not Lipschitz. In this paper we define a multiplicative weight update type dynamics (modification of the seminal Lee-Seung algorithm) that runs concurrently and provably avoids saddle points (first order stationary points that are not second order). Our techniques combine tools from dynamical systems such as stability and exploit the geometry of the NMF objective by reducing the standard NMF formulation over the non-negative orthant to a new formulation over (a scaled) simplex. An important advantage of our method is the use of concurrent updates, which permits implementations in parallel computing environments.

algorithm, nonnegative matrix factorization, stationary point, (13 more...)

arXiv.org Machine Learning

2002.11323

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Complete Dictionary Learning via $\ell_p$-norm Maximization

Shen, Yifei, Xue, Ye, Zhang, Jun, Letaief, Khaled B., Lau, Vincent

arXiv.org Machine LearningFeb-26-2020

Dictionary learning is a classic representation learning method that has been widely applied in signal processing and data analytics. In this paper, we investigate a family of $\ell_p$-norm ($p>2,p \in \mathbb{N}$) maximization approaches for the complete dictionary learning problem from theoretical and algorithmic aspects. Specifically, we prove that the global maximizers of these formulations are very close to the true dictionary with high probability, even when Gaussian noise is present. Based on the generalized power method (GPM), an efficient algorithm is then developed for the $\ell_p$-based formulations. We further show the efficacy of the developed algorithm: for the population GPM algorithm over the sphere constraint, it first quickly enters the neighborhood of a global maximizer, and then converges linearly in this region. Extensive experiments will demonstrate that the $\ell_p$-based approaches enjoy a higher computational efficiency and better robustness than conventional approaches and $p=3$ performs the best.

algorithm, formulation, global maximizer, (14 more...)

arXiv.org Machine Learning

2002.10043

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Sharp Asymptotics and Optimal Performance for Inference in Binary Models

Taheri, Hossein, Pedarsani, Ramtin, Thrampoulidis, Christos

arXiv.org Machine LearningFeb-26-2020

We study convex empirical risk minimization for high-dimensional inference in binary models. Our first result sharply predicts the statistical performance of such estimators in the linear asymptotic regime under isotropic Gaussian features. Importantly, the predictions hold for a wide class of convex loss functions, which we exploit in order to prove a bound on the best achievable performance among them. Notably, we show that the proposed bound is tight for popular binary models (such as Signed, Logistic or Probit), by constructing appropriate loss functions that achieve it. More interestingly, for binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one. Numerical simulations corroborate our theoretical findings and suggest they are accurate even for relatively small problem dimensions.

equation, loss function, optimal loss function, (16 more...)

arXiv.org Machine Learning

2002.07284

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

CAAI -- A Cognitive Architecture to Introduce Artificial Intelligence in Cyber-Physical Production Systems

Fischbach, Andreas, Strohschein, Jan, Bunte, Andreas, Stork, Jörg, Faeskorn-Woyke, Heide, Moriz, Natalia, Bartz-Beielstein, Thomas

arXiv.org Artificial IntelligenceFeb-26-2020

This paper introduces CAAI, a novel cognitive architecture for artificial intelligence in cyber-physical production systems. The goal of the architecture is to reduce the implementation effort for the usage of artificial intelligence algorithms. The core of the CAAI is a cognitive module that processes declarative goals of the user, selects suitable models and algorithms, and creates a configuration for the execution of a processing pipeline on a big data platform. Constant observation and evaluation against performance criteria assess the performance of pipelines for many and varying use cases. Based on these evaluations, the pipelines are automatically adapted if necessary. The modular design with well-defined interfaces enables the reusability and extensibility of pipeline components. A big data platform implements this modular design supported by technologies such as Docker, Kubernetes, and Kafka for virtualization and orchestration of the individual components and their communication. The implementation of the architecture is evaluated using a real-world use case.

algorithm, module, use case, (16 more...)

arXiv.org Artificial Intelligence

2003.00925

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany (0.04)
North America > United States > South Carolina > Charleston County > North Charleston (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry:

Information Technology (0.46)
Energy (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Efficient Rollout Strategies for Bayesian Optimization

Lee, Eric Hans, Eriksson, David, Cheng, Bolong, McCourt, Michael, Bindel, David

arXiv.org Artificial IntelligenceFeb-26-2020

Bayesian optimization (BO) is a class of sample-efficient global optimization methods, where a probabilistic model conditioned on previous observations is used to determine future evaluations via the optimization of an acquisition function. Most acquisition functions are myopic, meaning that they only consider the impact of the next function evaluation. Non-myopic acquisition functions consider the impact of the next $h$ function evaluations and are typically computed through rollout, in which $h$ steps of BO are simulated. These rollout acquisition functions are defined as $h$-dimensional integrals, and are expensive to compute and optimize. We show that a combination of quasi-Monte Carlo, common random numbers, and control variates significantly reduce the computational burden of rollout. We then formulate a policy-search based approach that removes the need to optimize the rollout acquisition function. Finally, we discuss the qualitative behavior of rollout policies in the setting of multi-modal objectives and model error.

acquisition function, optimization, rollout, (14 more...)

arXiv.org Artificial Intelligence

2002.10539

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)

Add feedback