AITopics

Black-box optimization is important for many compute-intensive applications, including reinforcement learning (RL) and robot control. This paper presents a novel theoretical framework for black-box optimization, in which our method performs a stochastic update within a trust region defined by the KL divergence. We show that this update is equivalent to a natural gradient step w.r.t. the natural parameters of an exponential-family distribution. Theoretically, we prove the convergence rate of our framework for convex functions. Our theoretical results also hold for non-differentiable black-box functions. Empirically, our method achieves superior performance compared with the state-of-the-art method CMA-ES on separable benchmark test problems.

artificial intelligence, machine learning, (14 more...)

1910.04301

Genre: Research Report (0.84)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

One Sample Stochastic Frank-Wolfe

Zhang, Mingrui, Shen, Zebang, Mokhtari, Aryan, Hassani, Hamed, Karbasi, Amin

One of the beauties of the projected gradient descent method lies in its rather simple mechanism and yet stable behavior with inexact, stochastic gradients, which has led to its widespread use in many machine learning applications. However, once we replace the projection operator with a simpler linear program, as is done in the Frank-Wolfe method, both simplicity and stability take a serious hit. The aim of this paper is to bring them back without sacrificing efficiency. We propose the first one-sample stochastic Frank-Wolfe algorithm, called 1-SFW, that avoids the need to carefully tune the batch size, step size, learning rate, and other complicated hyperparameters. In particular, 1-SFW achieves the optimal convergence rate of $\mathcal{O}(1/\epsilon^2)$ for reaching an $\epsilon$-suboptimal solution in the stochastic convex setting, and a $(1-1/e)-\epsilon$ approximate solution for a stochastic monotone DR-submodular maximization problem. Moreover, in a general non-convex setting, 1-SFW finds an $\epsilon$-first-order stationary point after at most $\mathcal{O}(1/\epsilon^3)$ iterations, achieving the current best known convergence rate. All of this is possible by designing a novel unbiased momentum estimator that governs the stability of the optimization process while using a single sample at each iteration.
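As a rough illustration of the mechanism described in the abstract above, the sketch below runs a Frank-Wolfe loop over the probability simplex using one noisy gradient sample per iteration. It is not the 1-SFW algorithm: the running average here is a plain (biased) momentum average rather than the unbiased estimator the paper proposes, and the quadratic objective, the schedules for `rho` and `eta`, and all function names are assumptions chosen for illustration.

```python
import random


def lmo_simplex(grad):
    """Linear minimization oracle over the probability simplex.

    The minimizer of <grad, v> over the simplex is the vertex at the
    coordinate with the smallest gradient entry.
    """
    v = [0.0] * len(grad)
    v[min(range(len(grad)), key=grad.__getitem__)] = 1.0
    return v


def objective(x, target):
    """Assumed test objective: 0.5 * ||x - target||^2."""
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def one_sample_fw(target, n_iters=2000, noise=0.1, seed=0):
    """Frank-Wolfe with a single noisy gradient sample per iteration."""
    rng = random.Random(seed)
    d = len(target)
    x = [1.0 / d] * d        # start at the center of the simplex
    g_avg = [0.0] * d        # running (momentum) gradient estimate
    for t in range(1, n_iters + 1):
        rho = t ** (-2.0 / 3.0)    # assumed averaging weight
        eta = 2.0 / (t + 2.0)      # assumed step size
        # One stochastic gradient sample: true gradient plus Gaussian noise.
        sample = [xi - ti + noise * rng.gauss(0.0, 1.0)
                  for xi, ti in zip(x, target)]
        g_avg = [(1.0 - rho) * g + rho * s for g, s in zip(g_avg, sample)]
        v = lmo_simplex(g_avg)
        # Convex combination keeps x inside the simplex without projection.
        x = [(1.0 - eta) * xi + eta * vi for xi, vi in zip(x, v)]
    return x
```

Because each iterate is a convex combination of simplex points, feasibility is maintained without any projection step, which is the property that makes Frank-Wolfe attractive in the first place.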

algorithm, optimization, unbiased estimator, (15 more...)

1910.04322

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.54)

Chen, Xi, Krishnamurthy, Akshay, Wang, Yining

Robust Dynamic Assortment Optimization in the Presence of Outlier Customers

We consider the dynamic assortment optimization problem under the multinomial logit model (MNL) with unknown utility parameters. The main question investigated in this paper is model mis-specification under the $\varepsilon$-contamination model, which is a fundamental model in robust statistics and machine learning. In particular, throughout a selling horizon of length $T$, we assume that customers make purchases according to a well-specified underlying multinomial logit choice model in a ($1-\varepsilon$)-fraction of the time periods, and make arbitrary purchasing decisions instead in the remaining $\varepsilon$-fraction of the time periods. In this model, we develop a new robust online assortment optimization policy via an active elimination strategy. We establish both upper and lower bounds on the regret, and show that our policy is optimal up to a logarithmic factor in $T$ when the assortment capacity is constant. Furthermore, we develop a fully adaptive policy that does not require any prior knowledge of the contamination parameter $\varepsilon$. Our simulation study shows that our policy outperforms the existing policies based on upper confidence bounds (UCB) and Thompson sampling.
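To make the customer model in the abstract above concrete, here is a minimal sketch of MNL purchase probabilities and a contaminated purchase simulator. Two simplifications are assumptions of this sketch, not of the paper: each period is contaminated independently with probability `eps` (the paper only fixes the fraction of contaminated periods), and an outlier customer picks uniformly at random, which is just one of many possible arbitrary behaviors. The function names are hypothetical.

```python
import math
import random


def mnl_choice_probs(utilities):
    """MNL purchase probabilities for the offered assortment.

    Index 0 of the result is the no-purchase option, whose utility is
    fixed at 0; index i >= 1 corresponds to utilities[i - 1].
    """
    weights = [1.0] + [math.exp(v) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_purchase(utilities, eps, rng):
    """Draw one customer decision under an assumed contamination scheme."""
    n_options = len(utilities) + 1
    if rng.random() < eps:
        # Outlier customer: uniform choice stands in for "arbitrary".
        return rng.randrange(n_options)
    probs = mnl_choice_probs(utilities)
    return rng.choices(list(range(n_options)), weights=probs, k=1)[0]
```

For example, `simulate_purchase([1.0, 0.5], eps=0.1, rng=random.Random(0))` returns 0 for no purchase or the 1-based index of the purchased product.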

customer, time period, (16 more...)

1910.04183

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Nabi, Razieh, Malinsky, Daniel, Shpitser, Ilya

Optimal Training of Fair Predictive Models

Recently there has been sustained interest in modifying prediction algorithms to satisfy fairness constraints. These constraints are typically complex nonlinear functionals of the observed data distribution. Focusing on the causal constraints proposed by Nabi and Shpitser (2018), we introduce new theoretical results and optimization techniques to make model training easier and more accurate. Specifically, we show how to reparameterize the observed data likelihood such that fairness constraints correspond directly to parameters that appear in the likelihood, transforming a complex constrained optimization objective into a simple optimization problem with box constraints. We also exploit methods from empirical likelihood theory in statistics to improve predictive performance, without requiring parametric models for high-dimensional feature vectors.

likelihood, optimization problem, shpitser, (17 more...)

1910.04109

Country:

North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Buathong, Poompol, Ginsbourger, David, Krityakierne, Tipaluck

Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization

We focus on kernel methods for set-valued inputs and their application to Bayesian set optimization, notably combinatorial optimization. We introduce a class of (strictly) positive definite kernels that relies on Reproducing Kernel Hilbert Space embeddings, and successfully generalizes "double sum" set kernels recently considered in Bayesian set optimization, which turn out to be unsuitable for combinatorial optimization. The proposed class of kernels, for which we provide theoretical guarantees, essentially consists in applying an outer kernel on top of the canonical distance induced by a double sum kernel. Proofs of theoretical results about considered kernels are complemented by a few practicalities regarding hyperparameter fitting. We furthermore demonstrate the applicability of our approach in prediction and optimization tasks, relying both on toy examples and on two test cases from mechanical engineering and hydrogeology, respectively. Experimental results illustrate the added value of the approach and open new perspectives in prediction and sequential design with set inputs.
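The construction described in the abstract above, an outer kernel applied to the distance induced by a double sum kernel, can be sketched as follows. The Gaussian base kernel, the normalization of the double sum by the set sizes, the Gaussian outer kernel, and the parameter names `lengthscale` and `theta` are assumptions made for this illustration and are not taken from the paper.

```python
import math


def base_kernel(x, y, lengthscale=1.0):
    """Assumed Gaussian kernel between two points given as tuples."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * lengthscale ** 2))


def double_sum_kernel(s1, s2):
    """Double sum kernel between two finite sets (lists of points).

    The sum over all pairs is divided by the set sizes, which is an
    assumed normalization.
    """
    total = sum(base_kernel(x, y) for x in s1 for y in s2)
    return total / (len(s1) * len(s2))


def set_distance_sq(s1, s2):
    """Squared distance between sets induced by the double sum kernel."""
    d2 = (double_sum_kernel(s1, s1) + double_sum_kernel(s2, s2)
          - 2.0 * double_sum_kernel(s1, s2))
    return max(d2, 0.0)  # guard against tiny negative rounding errors


def outer_set_kernel(s1, s2, theta=1.0):
    """Assumed Gaussian outer kernel applied to the induced distance."""
    return math.exp(-set_distance_sq(s1, s2) / (2.0 * theta ** 2))
```

With this form, a set compared with itself has distance zero and kernel value 1, while two different sets get a value strictly between 0 and 1.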

double sum kernel, kernel, optimization, (13 more...)

1910.04086

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Bern > Bern (0.04)
Europe > France (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

Tomczak, Marcin B., de Cote, Enrique Munoz, Macua, Sergio Valcarcel, Vrancx, Peter

Compatible features for Monotonic Policy Improvement

arXiv.org Artificial Intelligence, Oct-9-2019

Recent policy optimization approaches have achieved substantial empirical success by constructing surrogate optimization objectives. The Approximate Policy Iteration objective (Schulman et al., 2015a; Kakade and Langford, 2002) has become a standard optimization target for reinforcement learning problems. Using this objective in practice requires an estimator of the advantage function. Policy optimization methods such as those proposed in Schulman et al. (2015b) estimate the advantages using a parametric critic. In this work we establish conditions under which the parametric approximation of the critic does not introduce bias into the updates of the surrogate objective. These results hold for a general class of parametric policies, including deep neural networks. We obtain a result analogous to the compatible features derived for the original Policy Gradient Theorem (Sutton et al., 1999). As a result, we also identify a previously unknown bias that current state-of-the-art policy optimization algorithms (Schulman et al., 2015a, 2017) have introduced by not employing these compatible features.

compatible feature, gradient, international conference, (14 more...)

1910.0388

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Tomczak, Marcin B., Kim, Dongho, Vrancx, Peter, Kim, Kee-Eung

Policy Optimization Through Approximated Importance Sampling

arXiv.org Artificial Intelligence, Oct-9-2019

Recent policy optimization approaches (Schulman et al., 2015a, 2017) have achieved substantial empirical successes by constructing new proxy optimization objectives. These proxy objectives allow stable and low-variance policy learning, but require small policy updates to ensure that the proxy objective remains an accurate approximation of the target policy value. In this paper we derive an alternative objective that obtains the value of the target policy by applying importance sampling. This objective can be directly estimated from samples, as it takes an expectation over trajectories generated by the current policy. However, the basic importance-sampled objective is not suitable for policy optimization, as it incurs unacceptable variance. We therefore introduce an approximation that allows us to directly trade off the bias of the approximation against the variance in policy updates. We show that our approximation unifies the proxy optimization approaches with the importance sampling objective and allows us to interpolate between them. We then provide a theoretical analysis of the method that directly quantifies the error term due to the approximation. Finally, we obtain a practical algorithm by optimizing the introduced objective with proximal policy optimization techniques (Schulman et al., 2017). We empirically demonstrate that the resulting algorithm yields superior performance on continuous control benchmarks.
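The basic idea of estimating a target policy's value from samples drawn under the current policy can be illustrated in a single-step setting. The sketch below is a deliberate simplification: it uses one action per sample instead of whole trajectories, it does not implement the paper's approximation or its bias-variance trade-off, and the function name and the two-action example are hypothetical.

```python
import random


def importance_sampling_value(rewards, behavior, target,
                              n_samples=20000, seed=0):
    """Estimate the target policy's expected reward from behavior samples.

    rewards[a] is the (deterministic) reward of action a; behavior and
    target are action-probability lists for the two policies.
    """
    rng = random.Random(seed)
    actions = list(range(len(rewards)))
    total = 0.0
    for _ in range(n_samples):
        # Sample an action from the behavior (current) policy.
        a = rng.choices(actions, weights=behavior, k=1)[0]
        # Reweight by the probability ratio so the expectation matches
        # the target policy.
        ratio = target[a] / behavior[a]
        total += ratio * rewards[a]
    return total / n_samples
```

With rewards `[1.0, 0.0]`, a uniform behavior policy, and a target policy `[0.9, 0.1]`, the true target value is 0.9 and the estimate should land near it. Over long trajectories the per-step ratios multiply, which is the source of the variance problem the abstract mentions.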

approximation, objective, variance, (15 more...)

1910.03857

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(5 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kalita, Himangshu, Thangavelautham, Jekan

Automated Multidisciplinary Design and Control of Hopping Robots for Exploration of Extreme Environments on the Moon and Mars

arXiv.org Artificial Intelligence, Oct-9-2019

The next frontier in solar system exploration will be missions targeting extreme and rugged environments such as caves, canyons, cliffs and crater rims of the Moon, Mars and icy moons. These environments are time capsules into the early formation of the solar system and will provide vital clues to how our early solar system gave way to the current planets and moons. These sites will also provide vital clues to the past and present habitability of these environments. Current landers and rovers are unable to access these areas of high interest due to limitations in precision landing techniques, the need for large and sophisticated science instruments, and a mission assurance and operations culture where risks are minimized at all costs. Our past work has shown the advantages of using multiple spherical hopping robots called SphereX for exploring these extreme environments. Our previous work was based on performing exploration with a human-designed baseline design of a SphereX robot. However, the design of SphereX is a complex task that involves a large number of design variables and multiple engineering disciplines. In this work we propose to use Automated Multidisciplinary Design and Control Optimization (AMDCO) techniques to find near-optimal design solutions in terms of mass, volume, power, and control for SphereX for different mission scenarios.

constraint, power system, robot, (13 more...)

1910.03827

Country:

North America > United States > Arizona > Pima County > Tucson (0.04)
North America > United States > New York (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.50)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

#artificialintelligence, Oct-8-2019, 21:11:47 GMT

How to Implement Bayesian Optimization from Scratch in Python

Many methods exist for function optimization, such as randomly sampling the variable search space, called random search, or systematically evaluating samples in a grid across the search space, called grid search. More principled methods are able to learn from sampling the space so that future samples are directed toward the parts of the search space that are most likely to contain the extrema. A directed approach to global optimization that uses probability is called Bayesian Optimization.
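The two baseline strategies named above, grid search and random search, can be sketched in a few lines. The one-dimensional objective and the function names are hypothetical and are not taken from the linked tutorial; neither strategy learns from earlier evaluations, which is the gap Bayesian Optimization is meant to fill.

```python
import random


def objective(x):
    """Hypothetical 1-D test function with its maximum at x = 0.3."""
    return -(x - 0.3) ** 2


def grid_search(f, low, high, n_points):
    """Evaluate f on an evenly spaced grid and return the best input."""
    step = (high - low) / (n_points - 1)
    candidates = [low + i * step for i in range(n_points)]
    return max(candidates, key=f)


def random_search(f, low, high, n_points, seed=0):
    """Evaluate f at uniformly random inputs and return the best one."""
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(n_points)]
    return max(candidates, key=f)
```

Calling `grid_search(objective, 0.0, 1.0, 11)` evaluates the points 0.0, 0.1, ..., 1.0, while `random_search(objective, 0.0, 1.0, 100)` spends its budget on uniformly drawn points.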

bayesian optimization, objective function, surrogate function, (15 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

#artificialintelligence, Oct-8-2019, 07:17:04 GMT

How It Feels to Learn Data Science in 2019

So I just have to buy a Tableau license and I'm now a data scientist? Okay, let's just take that sales pitch with a grain of salt. I may be clueless, but I know there is more to data science than making pretty visualizations. I can do that in Excel. You got to admit it is slick marketing though. Charting data is the fun stage, and they leave out the painful and time-consuming parts of working with data: cleaning, wrangling, transforming, and loading it. Yes, and that is why I suspect there is value in learning to code. Maybe you can learn Alteryx. There's another software called Alteryx that allows you to clean, wrangle, transform, and load data.

data science, machine learning, regression, (11 more...)

Country: North America > United States > California (0.04)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
(2 more...)