AITopics

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

Farias, Daniela D., Roy, Benjamin V.

A Cost-Shaping LP for Bellman Error Minimization with Performance Guarantees

Neural Information Processing SystemsDec-31-2005

We introduce a new algorithm based on linear programming that approximates the differential value function of an average-cost Markov decision process via a linear combination of pre-selected basis functions. The algorithm carries out a form of cost shaping and minimizes a version of Bellman error. We establish an error bound that scales gracefully with the number of states without imposing the (strong) Lyapunov condition required by its counterpart in[6]. We propose a path-following method that automates selection of important algorithm parameters which represent counterparts tothe "state-relevance weights" studied in [6].

algorithm, artificial intelligence, optimization problem, (18 more...)

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Wainwright, Martin J., Jaakkola, Tommi S., Willsky, Alan S.

MAP estimation via agreement on (hyper)trees: Message-passing and linear programming

arXiv.org Artificial IntelligenceAug-15-2005

We develop and analyze methods for computing provably optimal {\em maximum a posteriori} (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of tree-structured distributions, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is tight if and only if all the tree distributions share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original distribution. Next we develop two approaches to attempting to obtain tight upper bounds: (a) a {\em tree-relaxed linear program} (LP), which is derived from the Lagrangian dual of the upper bounds; and (b) a {\em tree-reweighted max-product message-passing algorithm} that is related to but distinct from the max-product algorithm. In this way, we establish a connection between a certain LP relaxation of the mode-finding problem, and a reweighted form of the max-product (min-sum) message-passing algorithm.

algorithm, artificial intelligence, optimization problem, (20 more...)

arXiv.org Artificial Intelligence

cs/0508070

Country:

Europe (0.92)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Watson, J. P., Whitley, L. D., Howe, A. E.

Linking Search Space Structure, Run-Time Dynamics, and Problem Difficulty: A Step Toward Demystifying Tabu Search

Journal of Artificial Intelligence ResearchAug-1-2005

Tabu search is one of the most effective heuristics for locating high-quality solutions to a diverse array of NP-hard combinatorial optimization problems. Despite the widespread success of tabu search, researchers have a poor understanding of many key theoretical aspects of this algorithm, including models of the high-level run-time dynamics and identification of those search space features that influence problem difficulty. We consider these questions in the context of the job-shop scheduling problem (JSP), a domain where tabu search algorithms have been shown to be remarkably effective. Previously, we demonstrated that the mean distance between random local optima and the nearest optimal solution is highly correlated with problem difficulty for a well-known tabu search algorithm for the JSP introduced by Taillard. In this paper, we discuss various shortcomings of this measure and develop a new model of problem difficulty that corrects these deficiencies. We show that Taillard's algorithm can be modeled with high fidelity as a simple variant of a straightforward random walk. The random walk model accounts for nearly all of the variability in the cost required to locate both optimal and sub-optimal solutions to random JSPs, and provides an explanation for differences in the difficulty of random versus structured JSPs. Finally, we discuss and empirically substantiate two novel predictions regarding tabu search algorithm behavior. First, the method for constructing the initial solution is highly unlikely to impact the performance of tabu search. Second, tabu tenure should be selected to be as small as possible while simultaneously avoiding search stagnation; values larger than necessary lead to significant degradations in performance.

artificial intelligence, null, optimization problem, (12 more...)

doi: 10.1613/jair.1576

AI Access Foundation

10419

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Journal of Artificial Intelligence ResearchJul-1-2005

Risk-Sensitive Reinforcement Learning Applied to Control under Constraints

Geibel, P., Wysotzki, F.

In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with respect to a policy as the probability of entering such a state when the policy is pursued. We consider the problem of finding good policies whose risk is smaller than some user-specified threshold, and formalize it as a constrained MDP with two criteria. The first criterion corresponds to the value function originally given. We will show that the risk can be formulated as a second criterion function based on a cumulative return, whose definition is independent of the original value function. We present a model free, heuristic reinforcement learning algorithm that aims at finding good deterministic policies. It is based on weighting the original value function and the risk. The weight parameter is adapted in order to find a feasible solution for the constrained problem that has a good performance with respect to the value function. The algorithm was successfully applied to the control of a feed tank with stochastic inflows that lies upstream of a distillation column. This control task was originally formulated as an optimal control problem with chance constraints, and it was solved under certain assumptions on the model to obtain an optimal solution. The power of our learning algorithm is that it can be used even when some of these restrictive assumptions are relaxed.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

doi: 10.1613/jair.1666

AI Access Foundation

10415

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
North America > United States > Massachusetts (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Roudenko, Olga, Schoenauer, Marc

Dominance Based Crossover Operator for Evolutionary Multi-objective Algorithms

arXiv.org Artificial IntelligenceMay-29-2005

In spite of the recent quick growth of the Evolutionary Multi-objective Optimization (EMO) research field, there has been few trials to adapt the general variation operators to the particular context of the quest for the Pareto-optimal set. The only exceptions are some mating restrictions that take in account the distance between the potential mates - but contradictory conclusions have been reported. This paper introduces a particular mating restriction for Evolutionary Multi-objective Algorithms, based on the Pareto dominance relation: the partner of a non-dominated individual will be preferably chosen among the individuals of the population that it dominates. Coupled with the BLX crossover operator, two different ways of generating offspring are proposed. This recombination scheme is validated within the well-known NSGA-II framework on three bi-objective benchmark problems and one real-world bi-objective constrained optimization problem. An acceleration of the progress of the population toward the Pareto set is observed on all problems.

artificial intelligence, operator, optimization problem, (12 more...)

arXiv.org Artificial Intelligence

cs/0505080

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Andrews, Stuart, Hofmann, Thomas

Multiple Instance Learning via Disjunctive Programming Boosting

Learning from ambiguous training data is highly relevant in many applications. We present a new learning algorithm for classification problems where labels are associated with sets of pattern instead of individual patterns. This encompasses multiple instance learning as a special case. Our approach is based on a generalization of linear programming boosting and uses results from disjunctive programming to generate successively stronger linear relaxations of a discrete non-convex problem.

constraint, inductive learning, optimization problem, (18 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)

Jordan, Michael I., Wainwright, Martin J.

Semidefinite Relaxations for Approximate Inference on Graphs with Cycles

We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropybound combined with a semidefinite outer bound on the marginal polytope. This combination leads to a log-determinant maximization problemthat can be solved by efficient interior point methods [8]. As with the Bethe approximation and its generalizations [12], the optimizing arguments of this problem can be taken as approximations to the exact marginals. In contrast to Bethe/Kikuchi approaches, our variational problemis strictly convex and so has a unique global optimum. An additional desirable feature is that the value of the optimal solution is guaranteed to provide an upper bound on the log partition function. In experimental trials, the performance of the log-determinant relaxation is comparable to or better than the sum-product algorithm, and by a substantial marginfor certain problem classes. Finally, the zero-temperature limit of our log-determinant relaxation recovers a class of well-known semidefinite relaxations for integer programming [e.g., 3].

artificial intelligence, constraint, optimization problem, (16 more...)

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Poupart, Pascal, Boutilier, Craig

Bounded Finite State Controllers

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic finite state controllers, combining several advantages of gradient ascent (efficiency, search through restricted controller space) and policy iteration (less vulnerability to local optima).

node, optimization problem, us government, (18 more...)

Country:

North America > United States (1.00)
North America > Canada > Ontario > Toronto (0.30)

Industry: Government > Regional Government > North America Government > United States Government (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Hauskrecht, Milos, Kveton, Branislav

Linear Program Approximations for Factored Continuous-State Markov Decision Processes

Approximate linear programming (ALP) has emerged recently as one of the most promising methods for solving complex factored MDPs with finite state spaces. In this work we show that ALP solutions are not limited only to MDPs with finite state spaces, but that they can also be applied successfully to factored continuous-state MDPs (CMDPs). We show how one can build an ALPbased approximation for such a model and contrast it to existing solution methods. We argue that this approach offers a robust alternative for solving high dimensional continuous-state space problems. The point is supported by experiments on three CMDP problems with 24-25 continuous state factors.

artificial intelligence, constraint, optimization problem, (18 more...)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.51)