AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Continuous DR-submodular Maximization: Structure and Algorithms

Bian, An, Levy, Kfir, Krause, Andreas, Buhmann, Joachim M.

Neural Information Processing SystemsDec-31-2017

DR-submodular continuous functions are important objectives with wide real-world applications spanning MAP inference in determinantal point processes (DPPs), and mean-field inference for probabilistic submodular models, amongst others. DR-submodularity captures a subclass of non-convex functions that enables both exact minimization and approximate maximization in polynomial time. In this work we study the problem of maximizing non-monotone DR-submodular continuous functions under general down-closed convex constraints. We start by investigating geometric properties that underlie such objectives, e.g., a strong relation between (approximately) stationary points and global optimum is proved. These properties are then used to devise two optimization algorithms with provable guarantees. Concretely, we first devise a "two-phase'' algorithm with 1/4 approximation guarantee. This algorithm allows the use of existing methods for finding (approximately) stationary points as a subroutine, thus, harnessing recent progress in non-convex optimization. Then we present a non-monotone Frank-Wolfe variant with 1/e approximation guarantee and sublinear convergence rate. Finally, we extend our approach to a broader class of generalized DR-submodular continuous functions, which captures a wider spectrum of applications. Our theoretical findings are validated on synthetic and real-world problem instances.

algorithm, artificial intelligence, machine learning, (13 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

k-Support and Ordered Weighted Sparsity for Overlapping Groups: Hardness and Algorithms

Lim, Cong Han, Wright, Stephen

Neural Information Processing SystemsDec-31-2017

The k-support and OWL norms generalize the l1 norm, providing better prediction accuracy and better handling of correlated variables. We study the norms obtained from extending the k-support norm and OWL norms to the setting in which there are overlapping groups. The resulting norms are in general NP-hard to compute, but they are tractable for certain collections of groups. To demonstrate this fact, we develop a dynamic program for the problem of projecting onto the set of vectors supported by a fixed number of groups. Our dynamic program utilizes tree decompositions and its complexity scales with the treewidth. This program can be converted to an extended formulation which, for the associated group structure, models the k-group support norms and an overlapping group variant of the ordered weighted l1 norm. Numerical results demonstrate the efficacy of the new penalties.

artificial intelligence, machine learning, optimization problem, (16 more...)

Neural Information Processing Systems

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Parametric Simplex Method for Sparse Learning

Pang, Haotian, Liu, Han, Vanderbei, Robert J., Zhao, Tuo

Neural Information Processing SystemsDec-31-2017

High dimensional sparse learning has imposed a great computational challenge to large scale data analysis. In this paper, we investiage a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and solve them by the parametric simplex method (PSM). PSM offers significant advantages over other competing methods: (1) PSM naturally obtains the complete solution path for all values of the regularization parameter; (2) PSM provides a high precision dual certificate stopping criterion; (3) PSM yields sparse solutions through very few iterations, and the solution sparsity significantly reduces the computational cost per iteration. Particularly, we demonstrate the superiority of PSM over various sparse learning approaches, including Dantzig selector for sparse linear regression, sparse support vector machine for sparse linear classification, and sparse differential network estimation. We then provide sufficient conditions under which PSM always outputs sparse solutions such that its computational performance can be significantly boosted. Thorough numerical experiments are provided to demonstrate the outstanding performance of the PSM method.

artificial intelligence, machine learning, simplex method, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.68)

Add feedback

Dynamic Pricing in High-dimensions

Javanmard, Adel, Nazerzadeh, Hamid

arXiv.org Machine LearningDec-31-2017

We study the pricing problem faced by a firm that sells a large number of products, described via a wide range of features, to customers that arrive over time. Customers independently make purchasing decisions according to a general choice model that includes products features and customers' characteristics, encoded as $d$-dimensional numerical vectors, as well as the price offered. The parameters of the choice model are a priori unknown to the firm, but can be learned as the (binary-valued) sales data accrues over time. The firm's objective is to minimize the regret, i.e., the expected revenue loss against a clairvoyant policy that knows the parameters of the choice model in advance, and always offers the revenue-maximizing price. This setting is motivated in part by the prevalence of online marketplaces that allow for real-time pricing. We assume a structured choice model, parameters of which depend on $s_0$ out of the $d$ product features. We propose a dynamic policy, called Regularized Maximum Likelihood Pricing (RMLP) that leverages the (sparsity) structure of the high-dimensional model and obtains a logarithmic regret in $T$. More specifically, the regret of our algorithm is of $O(s_0 \log d \cdot \log T)$. Furthermore, we show that no policy can obtain regret better than $O(s_0 (\log d + \log T))$.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Machine Learning

1609.07574

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Marketing (1.00)
Information Technology > Services (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(2 more...)

Add feedback

Modular proximal optimization for multidimensional total-variation regularization

Barbero, Álvaro, Sra, Suvrit

arXiv.org Machine LearningDec-30-2017

We study \emph{TV regularization}, a widely used technique for eliciting structured sparsity. In particular, we propose efficient algorithms for computing prox-operators for $\ell_p$-norm TV. The most important among these is $\ell_1$-norm TV, for whose prox-operator we present a new geometric analysis which unveils a hitherto unknown connection to taut-string methods. This connection turns out to be remarkably useful as it shows how our geometry guided implementation results in efficient weighted and unweighted 1D-TV solvers, surpassing state-of-the-art methods. Our 1D-TV solvers provide the backbone for building more complex (two or higher-dimensional) TV solvers within a modular proximal optimization approach. We review the literature for an array of methods exploiting this strategy, and illustrate the benefits of our modular design through extensive suite of experiments on (i) image denoising, (ii) image deconvolution, (iii) four variants of fused-lasso, and (iv) video denoising. To underscore our claims and permit easy reproducibility, we provide all the reviewed and our new TV solvers in an easy to use multi-threaded C++, Matlab and Python library.

artificial intelligence, machine learning, programming language, (20 more...)

arXiv.org Machine Learning

1411.0589

Country: North America > United States > Massachusetts (0.27)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(3 more...)

Add feedback

Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era

Ning, Chao, You, Fengqi

arXiv.org Artificial IntelligenceDec-29-2017

A novel data-driven stochastic robust optimization (DDSRO) framework is proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data. Uncertainty data in large datasets are often collected from various conditions, which are encoded by class labels. Machine learning methods including Dirichlet process mixture model and maximum likelihood estimation are employed for uncertainty modeling. A DDSRO framework is further proposed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem follows a two-stage stochastic programming approach to optimize the expected objective across different data classes; adaptive robust optimization is nested as the inner problem to ensure the robustness of the solution while maintaining computational tractability. A decomposition-based algorithm is further developed to solve the resulting multi-level optimization problem efficiently. Case studies on process network design and planning are presented to demonstrate the applicability of the proposed framework and algorithm.

optimization problem, renewable energy, uncertainty data, (16 more...)

arXiv.org Artificial Intelligence

1707.09198

Country: North America > United States > New York (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (1.00)
Materials > Chemicals (0.94)
Energy > Renewable (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Optimization and Testing in Linear Non-Gaussian Component Analysis

Jin, Ze, Risk, Benjamin B., Matteson, David S.

arXiv.org Machine LearningDec-29-2017

The ICA model is subject to a constraint that at most one of these components is Gaussian, which is required for model identifiability. Linear non-Gaussian component analysis (LNGCA) generalizes the ICA model to a linear latent factor model with any number of both non-Gaussian components (signals) and Gaussian components (noise), where observations are linear combinations of independent components. Although the individual Gaussian components are not identifiable, the Gaussian subspace is identifiable. We introduce an estimator along with its optimization approach in which non-Gaussian and Gaussian components are estimated simultaneously, maximizing the discrepancy of each non-Gaussian component from Gaussianity while minimizing the discrepancy of each Gaussian component from Gaussianity. When the number of non-Gaussian components is unknown, we develop a statistical test to determine it based on resampling and the discrepancy of estimated components. Through a variety of simulation studies, we demonstrate the improvements of our estimator over competing estimators, and we illustrate the effectiveness of the test to determine the number of non-Gaussian components. Further, we apply our method to real data examples and demonstrate its practical value.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Machine Learning

1712.08837

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Dual Based DSP Bidding Strategy and its Application

Liu, Huahui, Zhu, Mingrui, Meng, Xiaonan, Hu, Yi, Wang, Hao

arXiv.org Machine LearningDec-27-2017

In recent years, RTB(Real Time Bidding) becomes a popular online advertisement trading method. During the auction, each DSP(Demand Side Platform) is supposed to evaluate current opportunity and respond with an ad and corresponding bid price. It's essential for DSP to find an optimal ad selection and bid price determination strategy which maximizes revenue or performance under budget and ROI(Return On Investment) constraints in P4P(Pay For Performance) or P4U(Pay For Usage) mode. We solve this problem by 1) formalizing the DSP problem as a constrained optimization problem, 2) proposing the augmented MMKP(Multi-choice Multi-dimensional Knapsack Problem) with general solution, 3) and demonstrating the DSP problem is a special case of the augmented MMKP and deriving specialized strategy. Our strategy is verified through simulation and outperforms state-of-the-art strategies in real application. To the best of our knowledge, our solution is the first dual based DSP bidding framework that is derived from strict second price auction assumption and generally applicable to the multiple ads scenario with various objectives and constraints.

banking & finance, constraint, optimization problem, (15 more...)

arXiv.org Machine Learning

1705.09416

Country: North America > United States (0.31)

Genre: Research Report (0.50)

Industry:

Marketing (1.00)
Banking & Finance > Trading (1.00)
Energy > Oil & Gas (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

IHT dies hard: Provable accelerated Iterative Hard Thresholding

Khanna, Rajiv, Kyrillidis, Anastasios

arXiv.org Machine LearningDec-26-2017

We study --both in theory and practice-- the use of momentum motions in classic iterative hard thresholding (IHT) methods. By simply modifying plain IHT, we investigate its convergence behavior on convex optimization criteria with non-convex constraints, under standard assumptions. In diverse scenaria, we observe that acceleration in IHT leads to significant improvements, compared to state of the art projected gradient descent and Frank-Wolfe variants. As a byproduct of our inspection, we study the impact of selecting the momentum parameter: similar to convex settings, two modes of behavior are observed --"rippling" and linear-- depending on the level of momentum.

algorithm 1, convergence, variant, (13 more...)

arXiv.org Machine Learning

1712.09379

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Memory-efficient Kernel PCA via Partial Matrix Sampling and Nonconvex Optimization: a Model-free Analysis of Local Minima

Chen, Ji, Li, Xiaodong

arXiv.org Machine LearningDec-26-2017

Kernel PCA is a widely used nonlinear dimension reduction technique in machine learning, but storing the kernel matrix is notoriously challenging when the sample size is large. Inspired by Yi et al. [2016], where the idea of partial matrix sampling followed by nonconvex optimization is proposed for matrix completion and robust PCA, we apply a similar approach to memory-efficient Kernel PCA. In theory, with no assumptions on the kernel matrix in terms of eigenvalues or eigenvectors, we established a model-free theory for the low-rank approximation based on any local minimum of the proposed objective function. As interesting byproducts, when the underlying positive semidefinite matrix is assumed to be low-rank and highly structured, corollaries of our main theorem improve the state-of-the-art results of Ge et al. [2016, 2017] for nonconvex matrix completion with no spurious local minima. Numerical experiments also show that our approach is competitive in terms of approximation accuracy compared to the well-known Nystr\"{o}m algorithm for Kernel PCA.

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Machine Learning

1711.01742

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback