Recent work on approximate linear programming (ALP) techniques for first-order Markov Decision Processes (FOMDPs) represents the value function linearly w.r.t. a set of first-order basis functions and uses linear programming techniques to determine suitable weights. This approach offers the advantage that it does not require simplification of the first-order value function, and allows one to solve FOMDPs independent of a specific domain instantiation. In this paper, we address several questions to enhance the applicability of this work: (1) Can we extend the first-order ALP framework to approximate policy iteration to address performance deficiencies of previous approaches? (2) Can we automatically generate basis functions and evaluate their impact on value function quality? (3) How can we decompose intractable problems with universally quantified rewards into tractable subproblems? We propose answers to these questions along with a number of novel optimizations and provide a comparative empirical evaluation on logistics problems from the ICAPS 2004 Probabilistic Planning Competition.
Finally, all of the standard algorithms Consider a system consisting of multiple interacting are based on a centralized computation paradigm: the agents, each trying to achieve its goals. Even if the entire game description is transmitted to some central agents are not in direct competition, they might not location, where the entire equilibrium profile is computed and then transmitted to the agents.
The curse of dimensionality gives rise to prohibitive computational requirements that render infeasible the exact solution of large-scale stochastic control problems. We study an efficient method based on linear programming for approximating solutions to such problems. Theapproach "fits" a linear combination of pre-selected basis functions to the dynamic programming cost-to- go function. We develop bounds on the approximation error and present experimental resultsin the domain of queueing network control, providing empirical support for the methodology.
This paper presents a novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful, dimension-independent approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable alternative to state-of-the-art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by developing a kernel-based mathematical program for ADP. Via a computational study on a controlled queueing network, we show that our non-parametric procedure is competitive with parametric ADP approaches.
Many computational settings are concerned with finding (all) models of a first-order logic theory for a fixed, finite domain. In this paper, we present a method to compute from a given theory and finite domain an approximate structure: a structure that approximates all models. We show confluence of this method and investigate its complexity.