AITopics

doi: 10.1613/jair.2290

10524

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(15 more...)

Genre: Research Report (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Journal of Artificial Intelligence ResearchOct-12-2007

Compressed Pattern Databases

Felner, A., Korf, R. E., Meshulam, R., Holte, R. C.

A pattern database (PDB) is a heuristic function implemented as a lookup table that stores the lengths of optimal solutions for subproblem instances. Standard PDBs have a distinct entry in the table for each subproblem instance. In this paper we investigate compressing PDBs by merging several entries into one, thereby allowing the use of PDBs that exceed available memory in their uncompressed form. We introduce a number of methods for determining which entries to merge and discuss their relative merits. These vary from domain-independent approaches that allow any set of entries in the PDB to be merged, to more intelligent methods that take into account the structure of the problem. The choice of the best compression method is based on domain-dependent attributes. We present experimental results on a number of combinatorial problems, including the four-peg Towers of Hanoi problem, the sliding-tile puzzles, and the Top-Spin puzzle. For the Towers of Hanoi, we show that the search time can be reduced by up to three orders of magnitude by using compressed PDBs compared to uncompressed PDBs of the same size. More modest improvements were observed for the other domains.

compression, mapping, pdb, (17 more...)

doi: 10.1613/jair.2241

10514

Country:

Asia > Vietnam > Hanoi > Hanoi (0.46)
North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Journal of Artificial Intelligence ResearchSep-21-2007

Graph Abstraction in Real-time Heuristic Search

Bulitko, V., Sturtevant, N., Lu, J., Yau, T.

Real-time heuristic search methods are used by situated agents in applications that require the amount of planning per move to be independent of the problem size. Such agents plan only a few actions at a time in a local search space and avoid getting trapped in local minima by improving their heuristic function over time. We extend a wide class of real-time search algorithms with automatically-built state abstraction and prove completeness and convergence of the resulting family of algorithms. We then analyze the impact of abstraction in an extensive empirical study in real-time pathfinding. Abstraction is found to improve efficiency by providing better trading offs between planning time, learning speed and other negatively correlated performance measures.

abstraction, algorithm, graph, (14 more...)

doi: 10.1613/jair.2293

10510

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology > Software (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Journal of Artificial Intelligence ResearchApr-29-2007

Consistency and Random Constraint Satisfaction Models

Gao, Y., Culberson, J.

In this paper, we study the possibility of designing non-trivial random CSP models by exploiting the intrinsic connection between structures and typical-case hardness. We show that constraint consistency, a notion that has been developed to improve the efficiency of CSP algorithms, is in fact the key to the design of random CSP models that have interesting phase transition behavior and guaranteed exponential resolution complexity without putting much restriction on the parameter of constraint tightness or the domain size of the problem. We propose a very flexible framework for constructing problem instances with interesting behavior and develop a variety of concrete methods to construct specific random CSP models that enforce different levels of constraint consistency. A series of experimental studies with interesting observations are carried out to illustrate the effectiveness of introducing structural elements in random instances, to verify the robustness of our proposal, and to investigate features of some specific models based on our framework that are highly related to the behavior of backtracking search algorithms.

constraint, graph, resolution complexity, (14 more...)

doi: 10.1613/jair.2155

10495

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > Canada > British Columbia > Regional District of Central Okanagan > Kelowna (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Taskar, Ben, Lacoste-Julien, Simon, Jordan, Michael I.

Structured Prediction via the Extragradient Method

We present a simple and scalable algorithm for large-margin estimation of structured models, including an important class of Markov networks and combinatorial models. We formulate the estimation problem as a convex-concave saddle-point problem and apply the extragradient method, yielding an algorithm with linear convergence using simple gradient and projection calculations. The projection step can be solved using combinatorial algorithms for min-cost quadratic flow. This makes the approach an efficient alternative to formulations based on reductions to a quadratic program (QP). We present experiments on two very different structured prediction tasks: 3D image segmentation and word alignment, illustrating the favorable scaling properties of our algorithm.

algorithm, alignment, word alignment, (17 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.05)
Asia > Middle East > Jordan (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Africa > Middle East > Egypt > Giza Governorate > Giza (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Rafols, Eddie, Koop, Anna, Sutton, Richard S.

Temporal Abstraction in Temporal-difference Networks

We present a generalization of temporal-difference networks to include temporally abstract options on the links of the question network. Temporal-difference (TD) networks have been proposed as a way of representing and learning a wide variety of predictions about the interaction between an agent and its environment. These predictions are compositional in that their targets are defined in terms of other predictions, and subjunctive in that that they are about what would happen if an action or sequence of actions were taken. In conventional TD networks, the interrelated predictions are at successive time steps and contingent on a single action; here we generalize them to accommodate extended time intervals and contingency on whole ways of behaving. Our generalization is based on the options framework for temporal abstraction. The primary contribution of this paper is to introduce a new algorithm for intra-option learning in TD networks with function approximation and eligibility traces.

agent, prediction, td network, (17 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Precup, Doina, Paduraru, Cosmin, Koop, Anna, Sutton, Richard S., Singh, Satinder P.

Off-policy Learning with Options and Recognizers

We introduce a new algorithm for off-policy temporal-difference learning with function approximation that has lower variance and requires less knowledge of the behavior policy than prior methods. We develop the notion of a recognizer, a filter on actions that distorts the behavior policy to produce a related target policy with low-variance importance-sampling corrections. We also consider target policies that are deviations from the state distribution of the behavior policy, such as potential temporally abstract options, which further reduces variance. This paper introduces recognizers and their potential advantages, then develops a full algorithm for linear function approximation and proves that its updates are in the same direction as on-policy TD updates, which implies asymptotic convergence. Even though our algorithm is based on importance sampling, we prove that it requires absolutely no knowledge of the behavior policy for the case of state-aggregation function approximators.

algorithm, behavior policy, recognizer, (14 more...)

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.15)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Mccracken, Peter, Bowling, Michael

Online Discovery and Learning of Predictive State Representations

Predictive state representations (PSRs) are a method of modeling dynamical systems using only observable data, such as actions and observations, to describe their model. PSRs use predictions about the outcome of future tests to summarize the system state. The best existing techniques for discovery and learning of PSRs use a Monte Carlo approach to explicitly estimate these outcome probabilities. In this paper, we present a new algorithm for discovery and learning of PSRs that uses a gradient descent approach to compute the predictions for the current state. The algorithm takes advantage of the large amount of structure inherent in a valid prediction matrix to constrain its predictions. Furthermore, the algorithm can be used online by an agent to constantly improve its prediction quality; something that current state of the art discovery and learning algorithms are unable to do. We give empirical results to show that our constrained gradient algorithm is able to discover core tests using very small amounts of data, and with larger amounts of data can compute accurate predictions of the system dynamics.

algorithm, core test, prediction, (16 more...)

Country: North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Engel, Yaakov, Szabo, Peter, Volkinshtein, Dmitry

Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods

The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet unknown. Robotic arms based on the same mechanical principles may render present day robotic arms obsolete. In this paper, we tackle this control problem using an online reinforcement learning algorithm, based on a Bayesian approach to policy evaluation known as Gaussian process temporal difference (GPTD) learning. Our substitute for the real arm is a computer simulation of a 2-dimensional model of an Octopus arm. Even with the simplifications inherent to this model, the state space we face is a high-dimensional one. We apply a GPTDbased algorithm to this domain, and demonstrate its operation on several learning tasks of varying degrees of difficulty.

algorithm, experiment, octopus arm, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Taskar, Ben, Lacoste-Julien, Simon, Jordan, Michael I.

Structured Prediction via the Extragradient Method

We present a simple and scalable algorithm for large-margin estimation of structured models, including an important class of Markov networks and combinatorial models. We formulate the estimation problem as a convex-concave saddle-point problem and apply the extragradient method, yielding an algorithm with linear convergence using simple gradient and projection calculations. The projection step can be solved using combinatorial algorithms for min-cost quadratic flow. This makes the approach an efficient alternative to formulations based on reductions to a quadratic program (QP). We present experiments on two very different structured prediction tasks: 3D image segmentation and word alignment, illustrating the favorable scaling properties of our algorithm.

algorithm, alignment, word alignment, (17 more...)