AITopics

2006.05691

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

MIT Technology ReviewJun-9-2020, 14:56:20 GMT

Quantum computing: A key ally for meeting business objectives

In the business world, the opportunities for applying quantum technology relate to optimization: solving difficult business problems, reconfiguring complex processes, and understanding correlations between seemingly disparate data sets. The main purpose of quantum computing is to carry out computationally costly operations in a very short period of time, while at the same time accelerating business performance. Quantum computing can optimize business processes for any number of solutions, for example maximizing cost/benefit ratios or optimizing financial assets, operations and logistics, and workforce management--usually delivering immediate financial gains. Many businesses are already using (or planning to use) classic optimization algorithms. And with four international case studies, Reply has proven that a quantum approach can give better results than existing optimization techniques. Speed and computational power are key components when working with data.

artificial intelligence, computing, optimization problem, (13 more...)

MIT Technology Review

AI-Alerts: 2020 > 2020-06 > AAAI AI-Alert for Jun 10, 2020 (1.00)

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)

Industry: Transportation > Ground > Rail (0.33)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Capone, Alexandre, Umlauft, Jonas, Beckers, Thomas, Lederer, Armin, Hirche, Sandra

Localized active learning of Gaussian process state space models

The performance of learning-based control techniques crucially depends on how effectively the system is explored. While most exploration techniques aim to achieve a globally accurate model, such approaches are generally unsuited for systems with unbounded state spaces. Furthermore, a globally accurate model is not required to achieve good performance in many common control applications, e.g., local stabilization tasks. In this paper, we propose an active learning strategy for Gaussian process state space models that aims to obtain an accurate model on a bounded subset of the state-action space. Our approach aims to maximize the mutual information of the exploration trajectories with respect to a discretization of the region of interest. By employing model predictive control, the proposed technique integrates information collected during exploration and adaptively improves its exploration strategy. To enable computational tractability, we decouple the choice of most informative data points from the model predictive control optimization step. This yields two optimization problems that can be solved in parallel. We apply the proposed method to explore the state space of various dynamical systems and compare our approach to a commonly used entropy-based exploration strategy. In all experiments, our method yields a better model within the region of interest than the entropy-based method.

gaussian process, optimization problem, upstream oil & gas, (19 more...)

2005.02191

Country:

Europe > Germany (0.14)
North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Tomar, Manan, Shani, Lior, Efroni, Yonathan, Ghavamzadeh, Mohammad

Mirror Descent Policy Optimization

arXiv.org Artificial IntelligenceJun-9-2020

We propose deep Reinforcement Learning (RL) algorithms inspired by mirror descent, a well-known first-order trust region optimization method for solving constrained convex problems. Our approach, which we call as Mirror Descent Policy Optimization (MDPO), is based on the idea of iteratively solving a `trust-region' problem that minimizes a sum of two terms: a linearization of the objective function and a proximity term that restricts two consecutive updates to be close to each other. Following this approach we derive on-policy and off-policy variants of the MDPO algorithm and analyze their performance while emphasizing important implementation details, motivated by the existing theoretical framework. We highlight the connections between on-policy MDPO and two popular trust region RL algorithms: TRPO and PPO, and conduct a comprehensive empirical comparison of these algorithms. We then derive off-policy MDPO and compare its performance to existing approaches. Importantly, we show that the theoretical framework of MDPO can be scaled to deep RL while achieving good performance on popular benchmarks.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2005.09814

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Distributional Robust Batch Contextual Bandits

Si, Nian, Zhang, Fan, Zhou, Zhengyuan, Blanchet, Jose

Policy learning using historical observational data is an important problem that has found widespread applications. Examples include selecting offers, prices, advertisements to send to customers, as well as selecting which medication to prescribe to a patient. However, existing literature rests on the crucial assumption that the future environment where the learned policy will be deployed is the same as the past environment that has generated the data--an assumption that is often false or too coarse an approximation. In this paper, we lift this assumption and aim to learn a distributional robust policy with incomplete (bandit) observational data. We propose a novel learning algorithm that is able to learn a robust policy to adversarial perturbations and unknown covariate shifts. We first present a policy evaluation procedure in the ambiguous environment and then give a performance guarantee based on the theory of uniform convergence. Additionally, we also give a heuristic algorithm to solve the distributional robust policy learning problems efficiently.

artificial intelligence, machine learning, optimization problem, (14 more...)

2006.0563

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Industry:

Marketing (0.66)
Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Chamon, Luiz F. O., Ribeiro, Alejandro

Probably Approximately Correct Constrained Learning

As learning solutions reach critical applications in social, industrial, and medical domains, the need to curtail their behavior becomes paramount. There is now ample evidence that without explicit tailoring, learning can lead to biased, unsafe, and prejudiced solutions. To tackle these problems, we develop a generalization theory of constrained learning based on the probably approximately correct (PAC) learning framework. In particular, we show that imposing requirements does not make a learning problem harder in the sense that any PAC learnable class is also PAC constrained learnable using a constrained counterpart of the empirical risk minimization (ERM) rule. For typical parametrized models, however, this learner involves solving a non-convex optimization program for which even obtaining a feasible solution may be hard. To overcome this issue, we prove that under mild conditions the empirical dual problem of constrained learning is also a PAC constrained learner that now leads to a practical constrained learning algorithm. We analyze the generalization properties of this solution and use it to illustrate how constrained learning can address problems in fair and robust classification.

artificial intelligence, constraint, machine learning, (16 more...)

2006.05487

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(31 more...)

Genre: Research Report (0.40)

Industry:

Education (1.00)
Government (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Perrone, Valerio, Donini, Michele, Kenthapadi, Krishnaram, Archambeau, Cédric

Fair Bayesian Optimization

Given the increasing importance of machine learning (ML) in our lives, algorithmic fairness techniques have been proposed to mitigate biases that can be amplified by ML. Commonly, these specialized techniques apply to a single family of ML models and a specific definition of fairness, limiting their effectiveness in practice. We introduce a general constrained Bayesian optimization (BO) framework to optimize the performance of any ML model while enforcing one or multiple fairness constraints. BO is a global optimization method that has been successfully applied to automatically tune the hyperparameters of ML models. We apply BO with fairness constraints to a range of popular models, including random forests, gradient boosting, and neural networks, showing that we can obtain accurate and fair solutions by acting solely on the hyperparameters. We also show empirically that our approach is competitive with specialized techniques that explicitly enforce fairness constraints during training, and outperforms preprocessing methods that learn unbiased representations of the input data. Moreover, our method can be used in synergy with such specialized fairness techniques to tune their hyperparameters. Finally, we study the relationship between hyperparameters and fairness of the generated model. We observe a correlation between regularization and unbiased models, explaining why acting on the hyperparameters leads to ML models that generalize well and are fair.

artificial intelligence, machine learning, optimization problem, (16 more...)

2006.05109

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Golovin, Daniel, Zhang, Qiuyi

Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization

Single-objective black box optimization (also known as zeroth-order optimization) is the process of minimizing a scalar objective $f(x)$, given evaluations at adaptively chosen inputs $x$. In this paper, we consider multi-objective optimization, where $f(x)$ outputs a vector of possibly competing objectives and the goal is to converge to the Pareto frontier. Quantitatively, we wish to maximize the standard hypervolume indicator metric, which measures the dominated hypervolume of the entire set of chosen inputs. In this paper, we introduce a novel scalarization function, which we term the hypervolume scalarization, and show that drawing random scalarizations from an appropriately chosen distribution can be used to efficiently approximate the hypervolume indicator metric. We utilize this connection to show that Bayesian optimization with our scalarization via common acquisition functions, such as Thompson Sampling or Upper Confidence Bound, provably converges to the whole Pareto frontier by deriving tight hypervolume regret bounds on the order of $\widetilde{O}(\sqrt{T})$. Furthermore, we highlight the general utility of our scalarization framework by showing that any provably convergent single-objective optimization process can be effortlessly converted to a multi-objective optimization process with provable convergence guarantees.

artificial intelligence, machine learning, scalarization, (14 more...)

2006.04655

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningJun-8-2020

CLAIMED: A CLAssification-Incorporated Minimum Energy Design to explore a multivariate response surface with feasibility constraints

Song, Yao, Sengul, Mert Y., He, Linglin, van Duin, Adri C. T., Hung, Ying, Dasgupta, Tirthankar

Motivated by the problem of optimization of force-field systems in physics using large-scale computer simulations, we consider exploration of a deterministic complex multivariate response surface. The objective is to find input combinations that generate output close to some desired or "target" vector. In spite of reducing the problem to exploration of the input space with respect to a one-dimensional loss function, the search is nontrivial and challenging due to infeasible input combinations, high dimensionalities of the input and output space and multiple "desirable" regions in the input space and the difficulty of emulating the objective function well with a surrogate model. We propose an approach that is based on combining machine learning techniques with smart experimental design ideas to locate multiple good regions in the input space.

algorithm, artificial intelligence, machine learning, (18 more...)

2006.05021

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Park, Soon-Seo, Park, Young-Jin, Min, Youngjae, Choi, Han-Lim

Online Gaussian Process State-Space Model: Learning and Planning for Partially Observable Dynamical Systems

arXiv.org Artificial IntelligenceJun-8-2020

This paper proposes an online learning method of Gaussian process state-space model (GP-SSM). GP-SSM is a probabilistic representation learning scheme that represents unknown state transition and/or measurement models as Gaussian processes (GPs). While the majority of prior literature on learning of GP-SSM are focused on processing a given set of time series data, data may arrive and accumulate sequentially over time in most dynamical systems. Storing all such sequential data and updating the model over entire data incur large amount of computational resources in space and time. To overcome this difficulty, we propose a practical method, termed \textit{onlineGPSSM}, that incorporates stochastic variational inference (VI) and online VI with novel formulation. The proposed method mitigates the computational complexity without catastrophic forgetting and also support adaptation to changes in a system and/or a real environments. Furthermore, we present application of onlineGPSSM into the reinforcement learning (RL) of partially observable dynamical systems by integrating onlineGPSSM with Bayesian filtering and trajectory optimization algorithms. Numerical examples are presented to demonstrate applicability of the proposed method.

artificial intelligence, machine learning, posterior, (19 more...)

arXiv.org Artificial Intelligence

1903.08643

Country:

North America > United States > Massachusetts (0.04)
Asia > Middle East > Jordan (0.04)
Asia > India (0.04)

Genre: Research Report (0.50)

Industry:

Energy (0.46)
Education (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)