AITopics | Optimization


Optimization


A Latent Variational Framework for Stochastic Optimization

arXiv.org Machine Learning, May-22-2019

This paper provides a unifying theoretical framework for stochastic optimization algorithms by means of a latent stochastic variational problem. Using techniques from stochastic control, the solution to the variational problem is shown to be equivalent to that of a Forward Backward Stochastic Differential Equation (FBSDE). By solving these equations, we recover a variety of existing adaptive stochastic gradient descent methods. This framework establishes a direct connection between stochastic optimization algorithms and a secondary Bayesian inference problem on gradients, where a prior measure on noisy gradient observations determines the resulting algorithm.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1905.01707

Country:

North America > Canada (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)


Efficient Profile Maximum Likelihood for Universal Symmetric Property Estimation

Charikar, Moses, Shiragur, Kirankumar, Sidford, Aaron

arXiv.org Machine Learning, May-21-2019

Estimating symmetric properties of a distribution, e.g. support size, coverage, entropy, and distance to uniformity, is among the most fundamental problems in algorithmic statistics. While each of these properties has been studied extensively and separate optimal estimators are known for each, in striking recent work, Acharya et al. 2016 showed that there is a single estimator that is competitive for all symmetric properties. This work proved that computing the distribution that approximately maximizes profile likelihood (PML), i.e. the probability of the observed frequency of frequencies, and returning the value of the property on this distribution is sample competitive with respect to a broad class of estimators of symmetric properties. Further, they showed that even computing an approximation of the PML suffices to achieve such a universal plug-in estimator. Unfortunately, prior to this work there was no known polynomial time algorithm to compute an approximate PML and it was open to obtain a polynomial time universal plug-in estimator through the use of approximate PML. In this paper we provide an algorithm (in number of samples) that, given $n$ samples from a distribution, computes an approximate PML distribution up to a multiplicative error of $\exp(n^{2/3} \mathrm{poly} \log(n))$ in time nearly linear in $n$. Generalizing work of Acharya et al. 2016 on the utility of approximate PML we show that our algorithm provides a nearly linear time universal plug-in estimator for all symmetric functions up to accuracy $\epsilon = \Omega(n^{-0.166})$. Further, we show how to extend our work to provide efficient polynomial-time algorithms for computing a $d$-dimensional generalization of PML (for constant $d$) that allows for universal plug-in estimation of symmetric relationships between distributions.
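To make the "frequency of frequencies" in this abstract concrete, the following is a minimal Python sketch of computing the profile of a sample. The function name `profile` and the example strings are illustrative assumptions, not taken from the paper, and the PML optimization itself is not shown.

```python
from collections import Counter

def profile(sample):
    """Return the profile of a sample: a map from multiplicity j to the
    number of distinct symbols that appear exactly j times."""
    symbol_counts = Counter(sample)               # symbol -> number of occurrences
    return dict(Counter(symbol_counts.values()))  # multiplicity -> number of symbols

# Two samples over different alphabets that share the same profile
# (hypothetical data chosen for illustration).
a = profile("abracadabra")
b = profile("xyzxwxvxyzx")
```

Because the profile discards symbol labels, any estimator that depends on the sample only through its profile treats these two samples identically, which is the sense in which the estimated properties are symmetric.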

artificial intelligence, exp, machine learning, (17 more...)

arXiv.org Machine Learning

1905.08448

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 8 > Red Deer County (0.04)
(4 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.41)


Data-driven preference learning methods for value-driven multiple criteria sorting with interacting criteria

Liu, Jiapeng, Kadzinski, Milosz, Liao, Xiuwu, Mao, Xiaoxin

arXiv.org Machine Learning, May-21-2019

The learning of predictive models for data-driven decision support has been a prevalent topic in many fields. However, construction of models that would capture interactions among input variables is a challenging task. In this paper, we present a new preference learning approach for multiple criteria sorting with potentially interacting criteria. It employs an additive piecewise-linear value function as the basic preference model, which is augmented with components for handling the interactions. To construct such a model from a given set of assignment examples concerning reference alternatives, we develop a convex quadratic programming model. Since its complexity does not depend on the number of training samples, the proposed approach is capable of dealing with data-intensive tasks. To improve the generalization of the constructed model on new instances and to overcome the problem of over-fitting, we employ regularization techniques. We also propose a few novel methods for classifying non-reference alternatives in order to enhance the applicability of our approach to different datasets. The practical usefulness of the proposed method is demonstrated on a problem of parametric evaluation of research units, whereas its predictive performance is studied on several monotone learning datasets. The experimental results indicate that our approach compares favourably with the classical UTADIS method and the Choquet integral-based sorting model.
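As a rough illustration of the basic preference model named in this abstract, the sketch below evaluates an additive piecewise-linear value function and sorts an alternative into a class by thresholds. It is only a sketch under stated assumptions: the breakpoints, marginal values, thresholds, and function names are hypothetical, and the interaction components and the quadratic programming fit described in the abstract are omitted.

```python
from bisect import bisect_right

def marginal_value(x, breakpoints, values):
    """Piecewise-linear interpolation of one criterion's marginal value.
    breakpoints must be increasing; values[i] is the value at breakpoints[i]."""
    if x <= breakpoints[0]:
        return values[0]
    if x >= breakpoints[-1]:
        return values[-1]
    i = bisect_right(breakpoints, x) - 1          # segment containing x
    t = (x - breakpoints[i]) / (breakpoints[i + 1] - breakpoints[i])
    return values[i] + t * (values[i + 1] - values[i])

def comprehensive_value(alternative, model):
    """Additive model: sum of marginal values over all criteria."""
    return sum(marginal_value(x, bp, vals)
               for x, (bp, vals) in zip(alternative, model))

def assign_class(score, thresholds):
    """Class index = number of thresholds the score reaches."""
    return bisect_right(thresholds, score)

# Hypothetical two-criterion model: (breakpoints, marginal values) per criterion.
model = [
    ([0.0, 50.0, 100.0], [0.0, 0.4, 0.5]),
    ([0.0, 10.0], [0.0, 0.5]),
]
thresholds = [0.3, 0.7]   # hypothetical class boundaries: 3 ordered classes

score = comprehensive_value([75.0, 4.0], model)
label = assign_class(score, thresholds)
```

Here the alternative scores 0.45 on the first criterion and 0.2 on the second, so it lands in the middle class.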

artificial intelligence, criteria, machine learning, (20 more...)

arXiv.org Machine Learning

1905.08506

Country:

Europe > Poland > Greater Poland Province > Poznań (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Oceania > New Zealand > North Island > Waikato (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)


Things You May Not Know About Adversarial Example: A Black-box Adversarial Image Attack

Duan, Yuchao, Zhao, Zhe, Bu, Lei, Song, Fu

arXiv.org Machine Learning, May-21-2019

Numerous methods for crafting adversarial examples with high success rates have been proposed recently. Most existing works first normalize images into a continuous real-vector domain and then craft adversarial examples in this domain. However, "adversarial" examples may become benign after de-normalizing them back into the discrete integer domain, known as the discretization problem. The discretization problem was mentioned in some work, but was underestimated and has received relatively little attention. In this work, we conduct the first comprehensive study of the discretization problem. We theoretically analyze 34 representative methods and empirically study 20 representative open source tools for crafting adversarial images. Our study reveals that almost all existing works suffer from the discretization problem and it is far more serious than originally thought. For instance, most black-box methods downgrade to white-box ones, and methods with high success rates drop to much lower ones, e.g., from 100% to 10%. This suggests that the discretization problem should be taken into account when crafting adversarial examples. As a first step towards addressing this problem, we propose a black-box method which reduces the adversarial example searching problem to a derivative-free optimization problem. Our method is able to craft 'real' adversarial images by derivative-free search on the discrete integer domain. Experimental results show that our method achieves a significantly higher success rate in terms of adversarial examples in the discrete integer domain than most other methods, whether white-box or black-box. Moreover, our method is able to handle models that are non-differentiable, and we successfully break the winner of the NIPS 2017 competition on defense with a 95% success rate.
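The discretization problem described in this abstract can be reproduced on a toy case. The sketch below is not the paper's method or any of the studied tools; it assumes a hypothetical two-pixel linear classifier and a hand-picked perturbation, and only shows how rounding back to 8-bit pixel values can undo a perturbation that worked in the continuous domain.

```python
def classify(pixels01, weights, bias):
    """Toy linear classifier on normalized pixels in [0, 1]; returns 0 or 1."""
    score = sum(w * p for w, p in zip(weights, pixels01)) + bias
    return 1 if score > 0 else 0

def normalize(pixels):
    """Map 8-bit integer pixels to the continuous range [0, 1]."""
    return [p / 255.0 for p in pixels]

def denormalize(pixels01):
    """Map back to valid 8-bit pixel values by rounding and clipping."""
    return [min(255, max(0, round(p * 255.0))) for p in pixels01]

# Hypothetical model and image, chosen so the image sits close to the boundary.
weights, bias = [1.0, -1.0], -0.001
image = [100, 100]

x = normalize(image)
# Perturbation of 0.001 per pixel, i.e. about a quarter of one gray level.
x_adv = [x[0] + 0.001, x[1] - 0.001]

before = classify(x, weights, bias)                  # original prediction
in_continuous = classify(x_adv, weights, bias)       # flipped in the real domain
saved = denormalize(x_adv)                           # what is written to disk
after_rounding = classify(normalize(saved), weights, bias)
```

In this toy case the perturbed vector is classified differently from the original, but the saved integer image equals the original image, so the attack disappears once the image is stored.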

adversarial example, discretization problem, proceedings, (15 more...)

arXiv.org Machine Learning

1905.07672

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
North America > United States > Texas (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)


Multi-view Locality Low-rank Embedding for Dimension Reduction

Feng, Lin, Meng, Xiangzhu, Wang, Huibing

arXiv.org Machine Learning, May-20-2019

During the last decades, we have witnessed a surge of interest in learning a low-dimensional space with discriminative information from one single view. Even though most such methods can achieve satisfactory performance in certain situations, they fail to fully consider the information from multiple views which are highly relevant but sometimes look different from each other. Besides, correlations between features from multiple views always vary greatly, which challenges multi-view subspace learning. Therefore, how to learn an appropriate subspace which can maintain valuable information from multi-view features is of vital importance but challenging. To tackle this problem, this paper proposes a novel multi-view dimension reduction method named Multi-view Locality Low-rank Embedding for Dimension Reduction (MvL2E). MvL2E makes full use of correlations between multi-view features by adopting low-rank representations. Meanwhile, it aims to maintain the correlations and construct a suitable manifold space to capture the low-dimensional embedding for multi-view features. A centroid-based scheme is designed to force multiple views to learn from each other, and an iterative alternating strategy is developed to obtain the optimal solution of MvL2E. The proposed method is evaluated on 5 benchmark datasets. Comprehensive experiments show that our proposed MvL2E can achieve performance comparable to that of previous approaches proposed in the recent literature.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

1905.08138

Country:

Asia > China > Liaoning Province > Dalian (0.06)
Oceania > Australia > South Australia > Adelaide (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.81)


Gaussian Process Learning via Fisher Scoring of Vecchia's Approximation

Guinness, Joseph

arXiv.org Machine Learning, May-20-2019

The Gaussian process model is an indispensable tool for the analysis of spatial and spatial-temporal datasets and has become increasingly popular as a general-purpose model for functions. Because of its high computational burden, researchers have devoted substantial effort to developing numerical approximations for Gaussian process computations. Much of the work focuses on efficient approximation of the likelihood function. Fast likelihood evaluations are crucial for optimization procedures that require many evaluations of the likelihood, such as the default Nelder-Mead algorithm (Nelder and Mead, 1965) in the R optim function. The likelihood must be repeatedly evaluated in MCMC algorithms as well. Compared to the amount of literature on efficient likelihood approximations, there has been considerably less development of techniques for numerically maximizing the likelihood (see Geoga et al. (2018) for one recent example). This article aims to address the disparity by providing: 1. Formulas for evaluating the gradient and Fisher information for Vecchia's likelihood approximation in a single pass through the data, so that the Fisher scoring algorithm can be applied. Fisher scoring is a modification of the Newton-Raphson optimization method, replacing the Hessian matrix with the Fisher information matrix.
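The Fisher scoring update mentioned at the end of this abstract can be sketched on a one-parameter problem. The example below does not implement Vecchia's approximation or a Gaussian process; it assumes a much simpler model, the variance v of zero-mean Gaussian data, where the score and Fisher information have closed forms. The function names and data are illustrative.

```python
def fisher_scoring(score, fisher_info, theta, tol=1e-10, max_iter=100):
    """One-parameter Fisher scoring: theta <- theta + score / fisher_info.
    Newton-Raphson would divide by the negative Hessian instead."""
    for _ in range(max_iter):
        step = score(theta) / fisher_info(theta)
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical data, assumed to be draws from N(0, v) with unknown variance v.
data = [0.5, -1.5, 2.0, -1.0, 1.0]
n = len(data)
sum_sq = sum(x * x for x in data)

def score(v):
    """Derivative of the log-likelihood with respect to v."""
    return -n / (2 * v) + sum_sq / (2 * v * v)

def fisher_info(v):
    """Expected information for v, which is n / (2 v^2) in this model."""
    return n / (2 * v * v)

v_hat = fisher_scoring(score, fisher_info, theta=1.0)
```

For this particular model a single Fisher scoring step lands on the maximum likelihood estimate, the mean of the squared observations, from any positive starting value. That is a property of this toy problem and should not be expected in general.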

artificial intelligence, loglikelihood, optimization problem, (16 more...)

arXiv.org Machine Learning

1905.08374

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)


MaxEntropy Pursuit Variational Inference

Egorov, Evgenii, Neklydov, Kirill, Kostoev, Ruslan, Burnaev, Evgeny

arXiv.org Machine Learning, May-19-2019

One of the core problems in variational inference is the choice of approximate posterior distribution. It is crucial to trade off between efficient inference with simple families, such as mean-field models, and the accuracy of inference. We propose a variant of a greedy approximation of the posterior distribution with tractable base learners. Using a Max-Entropy approach, we obtain a well-defined optimization problem. We demonstrate the ability of the method to capture complex multimodal posteriors in a continual learning setting for neural networks.

artificial intelligence, machine learning, optimization problem, (14 more...)

arXiv.org Machine Learning

1905.07855

Country: Europe > Russia (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Practical Bayesian Optimization with Threshold-Guided Marginal Likelihood Maximization

Kim, Jungtaek, Choi, Seungjin

arXiv.org Machine Learning, May-18-2019

We propose a practical Bayesian optimization method whose surrogate function is Gaussian process regression with threshold-guided marginal likelihood maximization. Because Bayesian optimization spends much of its time finding optimal free parameters of Gaussian process regression, reducing the time complexity of this step is critical to speeding up Bayesian optimization. For this reason, we propose a simple, straightforward Bayesian optimization method, assuming a reasonable condition that is observed in many practical examples. Our experimental results confirm that our method is effective in reducing the execution time. All implementations are available in our repository.

artificial intelligence, l-bfgs-b, machine learning, (15 more...)

arXiv.org Machine Learning

1905.0754

Country:

Europe (0.94)
North America > United States (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)


Approximation of the objective insensitivity regions using Hierarchic Memetic Strategy coupled with Covariance Matrix Adaptation Evolutionary Strategy

Sawicki, Jakub, Smołka, Maciej, Łoś, Marcin, Schaefer, Robert

arXiv.org Artificial Intelligence, May-17-2019

One of the most challenging types of ill-posedness in global optimization is the presence of insensitivity regions in design parameter space, so identifying their shape is crucial if the ill-posedness is irrecoverable. Such problems may be solved using global stochastic search followed by post-processing of a local sample and a local objective approximation. We propose a new approach of this type composed of the Hierarchic Memetic Strategy (HMS) powered by the Covariance Matrix Adaptation Evolutionary Strategy (CMA-ES), well known as an effective, self-adaptable stochastic optimization algorithm, and we leverage the distribution density knowledge it accumulates to better identify and separate insensitivity regions. The benchmark results show that the improved HMS-CMA-ES strategy is effective in both the total computational cost and the accuracy of insensitivity region approximation. The reference data for the tests was obtained by means of a well-known effective strategy of multimodal stochastic optimization called the Niching Evolutionary Algorithm 2 (NEA2), which also uses CMA-ES as a component.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1905.07288

Country: Europe > Poland (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)


MOBA: A multi-objective bounded-abstention model for two-class cost-sensitive problems

Guan, Hongjiao

arXiv.org Machine Learning, May-17-2019

Abstaining classifiers have been widely used in cost-sensitive applications to avoid ambiguous classification and reduce the cost of misclassification. Previous abstaining classification models rely on cost information, such as a cost matrix or cost ratio. However, it is difficult to obtain or estimate costs in practical applications. Furthermore, these abstention models are typically restricted to a single optimization metric, which may not be the expected indicator when evaluating classification performance. To overcome such problems, a multi-objective bounded-abstention (MOBA) model is proposed to optimize essential metrics. Specifically, the MOBA model minimizes the error rate of each class under class-dependent abstention constraints. The MOBA model is then solved using the non-dominated sorting genetic algorithm II, which is a popular evolutionary multi-objective optimization algorithm. A set of Pareto-optimal solutions will be generated and the best one can be selected according to provided conditions (whether costs are known) or performance demands (e.g., obtaining a high accuracy, F-measure, etc.). Hence, the MOBA model is robust towards variations in the conditions and requirements. Compared to state-of-the-art abstention models, MOBA achieves lower expected costs when cost information is considered, and better performance-abstention trade-offs when it is not.
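The quantities this abstract trades off, per-class error rate and per-class abstention rate, can be sketched for a two-threshold abstaining classifier. This is only an illustration under assumptions: the thresholding rule, the scores, and the choice to measure both rates over all examples of a class are hypothetical, and the multi-objective search with the non-dominated sorting genetic algorithm II is not shown.

```python
def evaluate(scores, labels, low, high):
    """Predict 0 if score < low, 1 if score > high, abstain otherwise.
    Returns {class: (error rate, abstention rate)}, both measured over
    all examples of that class (an assumption made for this sketch)."""
    stats = {}
    for cls in (0, 1):
        cls_scores = [s for s, y in zip(scores, labels) if y == cls]
        abstained = sum(low <= s <= high for s in cls_scores)
        if cls == 0:
            errors = sum(s > high for s in cls_scores)   # class 0 predicted as 1
        else:
            errors = sum(s < low for s in cls_scores)    # class 1 predicted as 0
        stats[cls] = (errors / len(cls_scores), abstained / len(cls_scores))
    return stats

# Hypothetical classifier scores and true labels.
scores = [0.1, 0.2, 0.3, 0.55, 0.45, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

no_reject = evaluate(scores, labels, 0.5, 0.5)     # single threshold, no abstention band
with_reject = evaluate(scores, labels, 0.4, 0.6)   # abstain on scores in [0.4, 0.6]
```

On this data, widening the abstention band removes the errors in both classes at the price of abstaining on a quarter of each class. A multi-objective solver would explore many such threshold pairs and keep the non-dominated ones.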

evolutionary algorithm, machine learning, moba model, (18 more...)

arXiv.org Machine Learning

1905.07297

Country: Asia (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
