AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Algorithm Runtime Prediction: Methods & Evaluation

Hutter, Frank, Xu, Lin, Hoos, Holger H., Leyton-Brown, Kevin

arXiv.org Artificial IntelligenceOct-26-2013

Perhaps surprisingly, it is possible to predict how long an algorithm will take to run on a previously unseen input, using machine learning techniques to build a model of the algorithm's runtime as a function of problem-specific instance features. Such models have important applications to algorithm analysis, portfolio-based algorithm selection, and the automatic configuration of parameterized algorithms. Over the past decade, a wide variety of techniques have been studied for building such models. Here, we describe extensions and improvements of existing models, new families of models, and -- perhaps most importantly -- a much more thorough treatment of algorithm parameters as model inputs. We also comprehensively describe new and existing features for predicting algorithm runtime for propositional satisfiability (SAT), travelling salesperson (TSP) and mixed integer programming (MIP) problems. We evaluate these innovations through the largest empirical analysis of its kind, comparing to a wide range of runtime modelling techniques from the literature. Our experiments consider 11 algorithms and 35 instance distributions; they also span a very wide range of SAT, MIP, and TSP instances, with the least structured having been generated uniformly at random and the most structured having emerged from real industrial applications. Overall, we demonstrate that our new models yield substantially better runtime predictions than previous approaches in terms of their generalization to new problem instances, to new algorithms from a parameterized space, and to both simultaneously.

logic & formal reasoning, machine learning, prediction, (23 more...)

arXiv.org Artificial Intelligence

1211.0906

Country:

North America > Canada (0.28)
North America > United States (0.27)
Asia > Vietnam > Long An Province (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry: Information Technology (0.68)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
(8 more...)

Add feedback

A Survey of Multi-Objective Sequential Decision-Making

Roijers, D. M., Vamplew, P., Whiteson, S., Dazeley, R.

Journal of Artificial Intelligence ResearchOct-18-2013

Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-objective problems. Therefore, we identify three distinct scenarios in which converting such a problem to a single-objective one is impossible, infeasible, or undesirable. Furthermore, we propose a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function (which projects multi-objective values to scalar ones), and the type of policies considered. We show how these factors determine the nature of an optimal solution, which can be a single policy, a convex hull, or a Pareto front. Using this taxonomy, we survey the literature on multi-objective methods for planning and learning. Finally, we discuss key applications of such methods and outline opportunities for future work.

momdp, objective, scalarization function, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3987

AI Access Foundation

10836

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(10 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Energy > Power Industry (1.00)
Banking & Finance (1.00)
Transportation > Ground > Road (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Duality between subgradient and conditional gradient methods

Bach, Francis

arXiv.org Machine LearningOct-18-2013

Many problems in machine learning, statistics and signal processing may be cast as convex optimization problems. In large-scale situations, simple gradient-based algorithms with potentially many cheap iterations are often preferred over methods, such as Newton's method or interior-point methods, that rely on fewer but more expensive iterations. The choice of a first-order method depends on the structure of the problem, in particular (a) the smoothness and/or strong convexity of the objective function, and (b) the computational efficiency of certain operations related to the non-smooth parts of the objective function, when it is decomposable in a smooth and a non-smooth part. In this paper, we consider two classical algorithms, namely (a) subgradient descent and its mirror descent extension [29, 24, 4], and (b) conditional gradient algorithms, sometimes referred to as Frank-Wolfe algorithms [16, 13, 15, 14, 19]. Subgradient algorithms are adapted to non-smooth unstructured situations, and after t steps have a convergence rate of O(1/ t) in terms of objective values.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1211.6302

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Inference, Sampling, and Learning in Copula Cumulative Distribution Networks

Webb, Stefan Douglas

arXiv.org Machine LearningOct-16-2013

The cumulative distribution network (CDN) is a recently developed class of probabilistic graphical models (PGMs) permitting a copula factorization, in which the CDF, rather than the density, is factored. Despite there being much recent interest within the machine learning community about copula representations, there has been scarce research into the CDN, its amalgamation with copula theory, and no evaluation of its performance. Algorithms for inference, sampling, and learning in these models are underdeveloped compared those of other PGMs, hindering widerspread use. One advantage of the CDN is that it allows the factors to be parameterized as copulae, combining the benefits of graphical models with those of copula theory. In brief, the use of a copula parameterization enables greater modelling flexibility by separating representation of the marginals from the dependence structure, permitting more efficient and robust learning. Another advantage is that the CDN permits the representation of implicit latent variables, whose parameterization and connectivity are not required to be specified. Unfortunately, that the model can encode only latent relationships between variables severely limits its utility. In this thesis, we present inference, learning, and sampling for CDNs, and further the state-of-the-art. First, we explain the basics of copula theory and the representation of copula CDNs. Then, we discuss inference in the models, and develop the first sampling algorithm. We explain standard learning methods, propose an algorithm for learning from data missing completely at random (MCAR), and develop a novel algorithm for learning models of arbitrary treewidth and size. Properties of the models and algorithms are investigated through Monte Carlo simulations. We conclude with further discussion of the advantages and limitations of CDNs, and suggest future work.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1310.4456

Country: Europe (0.27)

Genre: Research Report (0.81)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Add feedback

Sharp Inequalities for $f$-divergences

Guntuboyina, Adityanand, Saha, Sujayam, Schiebinger, Geoffrey

arXiv.org Machine LearningOct-15-2013

Suppose that the Kullback-Leibler divergence between two probability measures is bounded from above by 2. What then is the maximum possible value of the Hellinger distance between them? Such questions naturally arise in many fields including mathematical statistics and machine learning, information theory, probability, statistical physics etc. and the goal of this paper is to provide a way of answering them. From the variational viewpoint, this problem can be posed as: maximize the Hellinger distance subject to a constraint on the Kullback-Leibler divergence over the space of all pairs of probability measures over all possible sample spaces. We shall prove in this paper that the value of this maximization problem remains unchanged if one restricts the sample space to be the three-element set{ 1, 2, 3} . In other words, in order to find the maximum Hellinger distance subject to an upper bound on the Kullback-Leibler divergence, one can just restrict attention to pairs of probability measures on {1, 2, 3}. Thus, the large infinite-dimensional optimization problem is reduced to an optimization problem over a small finite-dimensional space (of dimension 4) which makes it tractable.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Machine Learning

1302.0336

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Variance Adjusted Actor Critic Algorithms

Tamar, Aviv, Mannor, Shie

arXiv.org Machine LearningOct-14-2013

In Reinforcement Learning (RL; [1]) and planning in Markov Decision Processes (MDPs; [2]), the typical objective is to maximize the cumulative (possibly discounted) expected reward, denoted by J. When the model's parameters are known, several well-established and efficient optimization algorithms are known. When the model parameters are not known, learning is needed and there are several algorithmic frameworks that solve the learning problem effectively, at least when the model is finite. Among these, actor-critic methods [3] are known to be particularly efficient. In typical actor-critic algorithms, the critic maintains an estimate of the value function - the expected reward-to-go. This function is then used by the actor to estimate the gradient of the objective with respect to some policy parameters, and then improve the policy by modifying the parameters in the direction of the gradient. The theory that underlies actor-critic algorithms is the policy gradient theorem [4], which relates the value function with the policy gradient.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1310.3697

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Add feedback

Confidence-constrained joint sparsity recovery under the Poisson noise model

Chunikhina, E., Raich, R., Nguyen, T.

arXiv.org Machine LearningOct-9-2013

Our work is focused on the joint sparsity recovery problem where the common sparsity pattern is corrupted by Poisson noise. We formulate the confidence-constrained optimization problem in both least squares (LS) and maximum likelihood (ML) frameworks and study the conditions for perfect reconstruction of the original row sparsity and row sparsity pattern. However, the confidence-constrained optimization problem is non-convex. Using convex relaxation, an alternative convex reformulation of the problem is proposed. We evaluate the performance of the proposed approach using simulation results on synthetic data and show the effectiveness of proposed row sparsity and row sparsity pattern recovery framework.

artificial intelligence, optimization problem, row sparsity, (13 more...)

arXiv.org Machine Learning

1309.1193

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Double four-bar crank-slider mechanism dynamic balancing by meta-heuristic algorithms

Emdadi, Habib, Yazdanian, Mahsa, Ettefagh, Mir Mohammad, Feizi-Derakhshi, Mohammad-Reza

arXiv.org Artificial IntelligenceOct-8-2013

In this paper, a new method for dynamic balancing of double four-bar crank slider mechanism by meta- heuristic-based optimization algorithms is proposed. For this purpose, a proper objective function which is necessary for balancing of this mechanism and corresponding constraints has been obtained by dynamic modeling of the mechanism. Then PSO, ABC, BGA and HGAPSO algorithms have been applied for minimizing the defined cost function in optimization step. The optimization results have been studied completely by extracting the cost function, fitness, convergence speed and runtime values of applied algorithms. It has been shown that PSO and ABC are more efficient than BGA and HGAPSO in terms of convergence speed and result quality. Also, a laboratory scale experimental doublefour-bar crank-slider mechanism was provided for validating the proposed balancing method practically.

artificial intelligence, evolutionary algorithm, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.5121/ijaia.2013.4501

1310.2089

Country: Asia > Middle East > Iran (0.16)

Genre: Research Report (0.40)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.71)

Add feedback

Online Learning of Dynamic Parameters in Social Networks

Shahrampour, Shahin, Rakhlin, Alexander, Jadbabaie, Ali

arXiv.org Machine LearningOct-1-2013

This paper addresses the problem of online learning in a dynamic setting. We consider a social network in which each individual observes a private signal about the underlying state of the world and communicates with her neighbors at each time period. Unlike many existing approaches, the underlying state is dynamic, and evolves according to a geometric random walk. We view the scenario as an optimization problem where agents aim to learn the true state while suffering the smallest possible loss. Based on the decomposition of the global loss function, we introduce two update mechanisms, each of which generates an estimate of the true state. We establish a tight bound on the rate of change of the underlying state, under which individuals can track the parameter with a bounded variance. Then, we characterize explicit expressions for the steady state mean-square deviation(MSD) of the estimates from the truth, per individual. We observe that only one of the estimators recovers the optimal MSD, which underscores the impact of the objective function decomposition on the learning quality. Finally, we provide an upper bound on the regret of the proposed methods, measured as an average of errors in estimating the parameter in a finite time.

artificial intelligence, machine learning, msd, (18 more...)

arXiv.org Machine Learning

1310.0432

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.40)

Industry:

Information Technology > Services (0.61)
Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
(2 more...)

Add feedback

Structured learning of sum-of-submodular higher order energy functions

Fix, Alexander, Joachims, Thorsten, Park, Sam, Zabih, Ramin

arXiv.org Machine LearningSep-30-2013

Submodular functions can be exactly minimized in polynomial time, and the special case that graph cuts solve with max flow [18] has had significant impact in computer vision [5, 20, 27]. In this paper we address the important class of sum-of-submodular (SoS) functions [2, 17], which can be efficiently minimized via a variant of max flow called submodular flow [6]. SoS functions can naturally express higher order priors involving, e.g., local image patches; however, it is difficult to fully exploit their expressive power because they have so many parameters. Rather than trying to formulate existing higher order priors as an SoS function, we take a discriminative learning approach, effectively searching the space of SoS functions for a higher order prior that performs well on our training set. We adopt a structural SVM approach [14, 33] and formulate the training problem in terms of quadratic programming; as a result we can efficiently search the space of SoS priors via an extended cutting-plane algorithm. We also show how the state-of-the-art max flow method for vision problems [10] can be modified to efficiently solve the submodular flow problem. Experimental comparisons are made against the OpenCV implementation of the GrabCut interactive segmentation technique [27], which uses hand-tuned parameters instead of machine learning. On a standard dataset [11] our method learns higher order priors with hundreds of parameter values, and produces significantly better segmentations. While our focus is on binary labeling problems, we show that our techniques can be naturally generalized to handle more than two labels.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1309.7512

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Add feedback