 Optimization


INFORMS Journal on Optimization

#artificialintelligence

INFORMS Journal on Optimization aims to publish papers in optimization with particular emphasis on data-driven optimization, optimization methods in machine learning, and exciting real-world applications of optimization. The journal also covers more traditional areas such as: convex and linear optimization; general purpose nonlinear optimization; discrete optimization (combinatorial, integer, mixed integer optimization); optimization under uncertainty (dynamic, stochastic, robust, simulation-based optimization); infinite dimensional optimization; and online optimization. Especially welcomed are contributions studying new and significant applications such as: healthcare; inventory and supply chain management; logistics; revenue management and pricing; energy; the Internet; interfaces with computer science; and finance.


Community detection in networks via nonlinear modularity eigenvectors

arXiv.org Machine Learning

Revealing a community structure in a network or dataset is a central problem arising in many scientific areas. The modularity function $Q$ is an established measure quantifying the quality of a community, where a community is identified as a set of nodes having high modularity. In our terminology, a set of nodes with positive modularity is called a \textit{module} and a set that maximizes $Q$ is thus called a \textit{leading module}. Finding a leading module in a network is an important task; however, the dimension of real-world problems makes the maximization of $Q$ infeasible. This creates the need for approximation techniques, which are typically based on a linear relaxation of $Q$, induced by the spectrum of the modularity matrix $M$. In this work we propose a nonlinear relaxation which is instead based on the spectrum of a nonlinear modularity operator $\mathcal M$. We show that extremal eigenvalues of $\mathcal M$ provide an exact relaxation of the modularity measure $Q$, although at the price of being more challenging to compute than those of $M$. Thus we extend the work done on nonlinear Laplacians by proposing a computational scheme, named \textit{generalized RatioDCA}, to address such extremal eigenvalues. We show monotonic ascent and convergence of the method. We finally apply the new method to several synthetic and real-world data sets, showing both effectiveness of the model and performance of the method.
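
The linear relaxation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard Newman modularity matrix $M = A - d d^T / \mathrm{vol}(G)$ and an unnormalized set modularity (the paper's exact normalization may differ), and it is not the generalized RatioDCA method. The function names and the toy graph are hypothetical, chosen only for illustration.

```python
import numpy as np

def modularity_matrix(A):
    """Newman modularity matrix M = A - d d^T / vol, where vol is the sum of degrees."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    return A - np.outer(d, d) / d.sum()

def set_modularity(A, nodes):
    """Unnormalized modularity of a node set S: the sum of M[i, j] over i, j in S."""
    M = modularity_matrix(A)
    idx = np.asarray(sorted(nodes))
    return float(M[np.ix_(idx, idx)].sum())

def spectral_split(A):
    """Linear relaxation: split nodes by the sign of the leading eigenvector of M."""
    M = modularity_matrix(A)
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues are returned in ascending order
    v = eigvecs[:, -1]              # eigenvector of the largest eigenvalue
    pos = set(np.flatnonzero(v > 0).tolist())
    neg = set(np.flatnonzero(v <= 0).tolist())
    return pos, neg

# Toy graph (an assumption for illustration): two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

side_a, side_b = spectral_split(A)
q_triangle = set_modularity(A, {0, 1, 2})
```

On this toy graph the sign split separates the two triangles, and each triangle has positive modularity, so each is a module in the abstract's terminology.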


Comprehensive Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems Using the R-Package flacco

arXiv.org Machine Learning

Choosing the best-performing optimizer(s) out of a portfolio of optimization algorithms is usually a difficult and complex task. It becomes even harder if the underlying functions are unknown, i.e., in so-called black-box problems, and function evaluations are considered to be expensive. In the case of continuous single-objective optimization problems, Exploratory Landscape Analysis (ELA) - a sophisticated and effective approach for characterizing the landscapes of such problems by means of numerical values before actually performing the optimization task itself - is advantageous. Unfortunately, until now it has been quite complicated to compute multiple ELA features simultaneously, as the corresponding code has been - if available at all - spread across multiple platforms or at least across several packages within these platforms. This article presents a broad summary of existing ELA approaches and introduces flacco, an R-package for feature-based landscape analysis of continuous and constrained optimization problems. Although its functions solve neither the optimization problem itself nor the related "Algorithm Selection Problem (ASP)", it offers easy access to an essential ingredient of the ASP by providing a wide collection of ELA features on a single platform - even within a single package. In addition, flacco provides multiple visualization techniques, which enhance the understanding of some of these numerical features, and thereby make certain landscape properties more comprehensible. On top of that, we will introduce the package's built-in, as well as web-hosted and hence platform-independent, graphical user interface (GUI), which facilitates the usage of the package - especially for people who are not familiar with R - making it a very convenient toolbox when working towards algorithm selection of continuous single-objective optimization problems.


Diverse Weighted Bipartite b-Matching

arXiv.org Artificial Intelligence

Bipartite matching, where agents on one side of a market are matched to agents or items on the other, is a classical problem in computer science and economics, with widespread application in healthcare, education, advertising, and general resource allocation. A practitioner's goal is typically to maximize a matching market's economic efficiency, possibly subject to some fairness requirements that promote equal access to resources. A natural balancing act exists between fairness and efficiency in matching markets, and has been the subject of much research. In this paper, we study a complementary goal--balancing diversity and efficiency--in a generalization of bipartite matching where agents on one side of the market can be matched to sets of agents on the other. Adapting a classical definition of the diversity of a set, we propose a quadratic programming-based approach to solving a supermodular minimization problem that balances diversity and total weight of the solution. We also provide a scalable greedy algorithm with theoretical performance bounds. We then define the price of diversity, a measure of the efficiency loss due to enforcing diversity, and give a worst-case theoretical bound. Finally, we demonstrate the efficacy of our methods on three real-world datasets, and show that the price of diversity is not bad in practice. Our code is publicly accessible for further research.


Data Analysis Method: Mathematics Optimization to Build Decision Making

@machinelearnbot

Examples include conic programming, semidefinite programming, semi-infinite programming, and some metaheuristic techniques. Today, a good deal of software support is needed to solve the problems encountered and to reach an optimal solution in a reasonable computation time. Successful application of optimization techniques requires at least three conditions: the ability to build mathematical models of the problems encountered, knowledge of optimization techniques, and knowledge of computer programs.


Subset Selection with Shrinkage: Sparse Linear Modeling when the SNR is low

arXiv.org Machine Learning

We study the behavior of a fundamental tool in sparse statistical modeling --the best-subset selection procedure (aka "best-subsets"). Assuming that the underlying linear model is sparse, it is well known, both in theory and in practice, that the best-subsets procedure works extremely well in terms of several statistical metrics (prediction, estimation and variable selection) when the signal to noise ratio (SNR) is high. However, its performance degrades substantially when the SNR is low -- it is outperformed in predictive accuracy by continuous shrinkage methods, such as ridge regression and the Lasso. We explain why this behavior should not come as a surprise, and contend that the original version of the classical best-subsets procedure was, perhaps, not designed to be used in the low SNR regimes. We propose a close cousin of best-subsets, namely, its $\ell_{q}$-regularized version, for $q \in\{1, 2\}$, which (a) mitigates, to a large extent, the poor predictive performance of best-subsets in the low SNR regimes; (b) performs favorably and generally delivers a substantially sparser model when compared to the best predictive models available via ridge regression and the Lasso. Our estimator can be expressed as a solution to a mixed integer second order conic optimization problem and, hence, is amenable to modern computational tools from mathematical optimization. We explore the theoretical properties of the predictive capabilities of the proposed estimator and complement our findings via several numerical experiments.
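
The idea of combining subset selection with shrinkage can be sketched for the $q = 2$ case as follows. This is a minimal illustration only: it assumes a squared $\ell_2$ penalty and uses exhaustive enumeration, which is practical only for a handful of features, whereas the abstract describes a mixed integer conic formulation. The function name `best_subset_ridge` and the synthetic data are hypothetical.

```python
from itertools import combinations
import numpy as np

def best_subset_ridge(X, y, k, lam):
    """Search all supports of size <= k, minimizing ||y - X b||^2 + lam * ||b||^2."""
    p = X.shape[1]
    # Start from the empty model, whose objective is ||y||^2.
    best_obj, best_support, best_beta = float(y @ y), (), np.zeros(p)
    for size in range(1, k + 1):
        for support in combinations(range(p), size):
            cols = list(support)
            Xs = X[:, cols]
            # Ridge solution restricted to the chosen columns.
            coef = np.linalg.solve(Xs.T @ Xs + lam * np.eye(size), Xs.T @ y)
            resid = y - Xs @ coef
            obj = float(resid @ resid + lam * (coef @ coef))
            if obj < best_obj:
                best_obj, best_support = obj, support
                best_beta = np.zeros(p)
                best_beta[cols] = coef
    return best_support, best_beta

# Synthetic data (an assumption for illustration): two active features out of six.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
true_beta = np.array([0.0, 3.0, 0.0, 0.0, -2.0, 0.0])
y = X @ true_beta + 0.1 * rng.standard_normal(50)

support, beta_hat = best_subset_ridge(X, y, k=2, lam=0.1)
```

Setting `lam=0` recovers the classical best-subsets fit for each support; increasing `lam` shrinks the selected coefficients toward zero, which is the mechanism the abstract credits for better predictions at low SNR.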


Gradient-enhanced kriging for high-dimensional problems

arXiv.org Machine Learning

Surrogate models provide a low computational cost alternative to evaluating expensive functions. The construction of accurate surrogate models with large numbers of independent variables is currently prohibitive because it requires a large number of function evaluations. Gradient-enhanced kriging has the potential to reduce the number of function evaluations for the desired accuracy when efficient gradient computation, such as an adjoint method, is available. However, current gradient-enhanced kriging methods do not scale well with the number of sampling points due to the rapid growth in the size of the correlation matrix, where new information is added for each sampling point in each direction of the design space. They do not scale well with the number of independent variables either, due to the increase in the number of hyperparameters that need to be estimated. To address this issue, we develop a new gradient-enhanced surrogate model approach that drastically reduces the number of hyperparameters through the use of the partial least squares method while maintaining accuracy. In addition, this method is able to control the size of the correlation matrix by adding only relevant points defined through the information provided by the partial least squares method. To validate our method, we compare the global accuracy of the proposed method with conventional kriging surrogate models on two analytic functions with up to 100 dimensions, as well as engineering problems of varied complexity with up to 15 dimensions. We show that the proposed method requires fewer sampling points than conventional methods to obtain the desired accuracy, or provides more accuracy for a fixed budget of sampling points. In some cases, we obtain models that are over 3 times more accurate than a bench of surrogate models from the literature, and over 3200 times faster than standard gradient-enhanced kriging models.


How AI-enabled Real-time Optimization Will Shape Content in Future

#artificialintelligence

Find out how to optimize your website to give your customers experiences that will have the biggest ROI for your business. AI, or artificial intelligence, is an attempt to make machines function like a human brain, using predictive analysis. Whenever I hear the buzzword 'AI', characters from Hollywood movies such as Jarvis, Samantha, and Hal pop up before my eyes. There seems to be an incredible possibility of a well-tuned relationship between customer and seller, especially when AI is blended innovatively with marketing. AI will doubtless play a great role in the coming days, making our lives easier and more comfortable.


Identifying global optimality for dictionary learning

arXiv.org Machine Learning

Learning new representations of input observations in machine learning is often tackled using a factorization of the data. For many such problems, including sparse coding and matrix completion, learning these factorizations can be difficult, both in terms of efficiency and in guaranteeing that the solution is a global minimum. Recently, a general class of objectives has been introduced--which we term induced dictionary learning models (DLMs)--that have an induced convex form that enables global optimization. Though attractive theoretically, this induced form is impractical, particularly for large or growing datasets. In this work, we investigate the use of practical alternating minimization algorithms for induced DLMs that ensure convergence to global optima. We characterize the stationary points of these models, and, using these insights, highlight practical choices for the objectives. We then provide theoretical and empirical evidence that alternating minimization, from a random initialization, converges to global minima for a large subclass of induced DLMs. In particular, we take advantage of the existence of the (potentially unknown) convex induced form to identify when stationary points are global minima for the dictionary learning objective. We then provide an empirical investigation into practical optimization choices for using alternating minimization for induced DLMs, for both batch and stochastic gradient descent.


Learning Approximately Objective Priors

arXiv.org Machine Learning

Informative Bayesian priors are often difficult to elicit, and when this is the case, modelers usually turn to noninformative or objective priors. However, objective priors such as the Jeffreys and reference priors are not tractable to derive for many models of interest. We address this issue by proposing techniques for learning reference prior approximations: we select a parametric family and optimize a black-box lower bound on the reference prior objective to find the member of the family that serves as a good approximation. We experimentally demonstrate the method's effectiveness by recovering Jeffreys priors and learning the Variational Autoencoder's reference prior.
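
As a reminder of what "recovering Jeffreys priors" refers to, the standard definition and a textbook worked example (the Bernoulli model, chosen here for illustration and not taken from the paper) are:

\[
\pi_J(\theta) \propto \sqrt{\det I(\theta)}, \qquad
I(\theta) = -\,\mathbb{E}_{x \sim p(x \mid \theta)}\!\left[ \frac{\partial^2}{\partial \theta^2} \log p(x \mid \theta) \right].
\]

For $p(x \mid \theta) = \theta^{x} (1 - \theta)^{1 - x}$ with $x \in \{0, 1\}$:

\[
I(\theta) = \frac{1}{\theta (1 - \theta)}
\quad\Longrightarrow\quad
\pi_J(\theta) \propto \theta^{-1/2} (1 - \theta)^{-1/2},
\]

which is the $\mathrm{Beta}(1/2, 1/2)$ distribution. For a one-dimensional model such as this one, a closed form is available; the abstract's point is that such derivations are not tractable for many models of interest.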