AITopics | Edmonton

Real-time search methods are suited for tasks in which the agent is interacting with an initially unknown environment in real time. In such simultaneous planning and learning problems, the agent has to select its actions in a limited amount of time, while sensing only a local part of the environment centered at the agent's current location. Real-time heuristic search agents select actions using a limited lookahead search and evaluating the frontier states with a heuristic function. Over repeated experiences, they refine heuristic values of states to avoid infinite loops and to converge to better solutions. The wide spread of such settings in autonomous software and hardware agents has led to an explosion of real-time search algorithms over the last two decades. Not only is a potential user confronted with a hodgepodge of algorithms, but he also faces the choice of control parameters they use. In this paper we address both problems. The first contribution is an introduction of a simple three-parameter framework (named LRTS) which extracts the core ideas behind many existing algorithms. We then prove that LRTA*, epsilon-LRTA*, SLA*, and gamma-Trap algorithms are special cases of our framework. Thus, they are unified and extended with additional features. Second, we prove completeness and convergence of any algorithm covered by the LRTS framework. Third, we prove several upper-bounds relating the control parameters and solution quality. Finally, we analyze the influence of the three control parameters empirically in the realistic scalable domains of real-time navigation on initially unknown maps from a commercial role-playing game as well as routing in ad hoc sensor networks.

algorithm, execution cost, heuristic value, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1789

AI Access Foundation

10440

Journal of Artificial Intelligence Research

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
North America > United States > New Jersey > Essex County > Newark (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(4 more...)

Genre: Research Report (0.45)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

Add feedback

Temporal-Difference Networks

Sutton, Richard S., Tanner, Brian

Neural Information Processing SystemsDec-31-2005

We introduce a generalization of temporal-difference (TD) learning to networks of interrelated predictions. Rather than relating a single prediction to itself at a later time, as in conventional TD methods, a TD network relates each prediction in a set of predictions to other predictions in the set at a later time. TD networks can represent and apply TD learning to a much wider class of predictions than has previously been possible. Using a random-walk example, we show that these networks can be used to learn to predict by a fixed interval, which is not possible with conventional TD methods. Secondly, we show that if the interpredictive relationships are made conditional on action, then the usual learning-efficiency advantage of TD methods over Monte Carlo (supervised learning) methods becomes particularly pronounced.

prediction, question network, td network, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Semi-Markov Conditional Random Fields for Information Extraction

Sarawagi, Sunita, Cohen, William W.

Neural Information Processing SystemsDec-31-2005

We describe semi-Markov conditional random fields (semi-CRFs), a conditionally trained version of semi-Markov chains. Intuitively, a semi-CRF on an input sequence x outputs a "segmentation" of x, in which labels are assigned to segments (i.e., subsequences) of x rather than to individual elements x

algorithm, dictionary feature, random field, (15 more...)

Neural Information Processing Systems

Country:

Asia > India (0.05)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Coarticulation in Markov Decision Processes

Rohanimanesh, Khashayar, Platt, Robert, Mahadevan, Sridhar, Grupen, Roderic

Neural Information Processing SystemsDec-31-2005

We investigate an approach for simultaneously committing to multiple activities, each modeled as a temporally extended action in a semi-Markov decision process (SMDP). For each activity we define a set of admissible solutions consisting of the redundant set of optimal policies, and those policies that ascend the optimal statevalue function associated with them. A plan is then generated by merging them in such a way that the solutions to the subordinate activities are realized in the set of admissible solutions satisfying the superior activities. We present our theoretical results and empirically evaluate our approach in a simulated domain.

admissible policy, controller, subgoal, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.29)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

Add feedback

Maximum Likelihood Estimation of Intrinsic Dimension

Levina, Elizaveta, Bickel, Peter J.

Neural Information Processing SystemsDec-31-2005

We propose a new method for estimating intrinsic dimension of a dataset derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically and by simulations, and apply it to a number of simulated and real datasets. We also show it has the best overall performance compared with two other intrinsic dimension estimators.

dimension, estimator, intrinsic dimension, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Convergence and No-Regret in Multiagent Learning

Bowling, Michael

Neural Information Processing SystemsDec-31-2005

Learning in a multiagent system is a challenging problem due to two key factors. First, if other agents are simultaneously learning then the environment is no longer stationary, thus undermining convergence guarantees. Second, learning is often susceptible to deception, where the other agents may be able to exploit a learner's particular dynamics. In the worst case, this could result in poorer performance than if the agent was not learning at all. These challenges are identifiable in the two most common evaluation criteria for multiagent learning algorithms: convergence and regret.

algorithm, convergence, giga-wolf, (14 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Industry: Leisure & Entertainment > Games (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Markov Networks for Detecting Overalpping Elements in Sequence Data

Craven, Mark, Bockhorst, Joseph

Neural Information Processing SystemsDec-31-2005

Many sequential prediction tasks involve locating instances of patterns in sequences. Generative probabilistic language models, such as hidden Markov models (HMMs), have been successfully applied to many of these tasks. A limitation of these models however, is that they cannot naturally handle cases in which pattern instances overlap in arbitrary ways. We present an alternative approach, based on conditional Markov networks, that can naturally represent arbitrarily overlapping elements. We show how to efficiently train and perform inference with these models. Experimental results from a genomics domain show that our models are more accurate at locating instances of overlapping patterns than are baseline models based on HMMs.

pattern type, sequence, terminator, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Missouri > St. Louis County > St. Louis (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
(2 more...)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Temporal-Difference Networks

Sutton, Richard S., Tanner, Brian

Neural Information Processing SystemsDec-31-2005

We introduce a generalization of temporal-difference (TD) learning to networks of interrelated predictions. Rather than relating a single prediction to itself at a later time, as in conventional TD methods, a TD network relates each prediction in a set of predictions to other predictions in the set at a later time. TD networks can represent and apply TD learning to a much wider class of predictions than has previously been possible. Using a random-walk example, we show that these networks can be used to learn to predict by a fixed interval, which is not possible with conventional TD methods. Secondly, we show that if the interpredictive relationships are made conditional on action, then the usual learning-efficiency advantage of TD methods over Monte Carlo (supervised learning) methods becomes particularly pronounced.

prediction, question network, td network, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Semi-Markov Conditional Random Fields for Information Extraction

Sarawagi, Sunita, Cohen, William W.

Neural Information Processing SystemsDec-31-2005

We describe semi-Markov conditional random fields (semi-CRFs), a conditionally trained version of semi-Markov chains. Intuitively, a semi-CRF on an input sequence x outputs a "segmentation" of x, in which labels are assigned to segments (i.e., subsequences) of x rather than to individual elements x

algorithm, dictionary feature, random field, (15 more...)

Neural Information Processing Systems

Country:

Asia > India (0.05)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Coarticulation in Markov Decision Processes

Rohanimanesh, Khashayar, Platt, Robert, Mahadevan, Sridhar, Grupen, Roderic

Neural Information Processing SystemsDec-31-2005

We investigate an approach for simultaneously committing to multiple activities, each modeled as a temporally extended action in a semi-Markov decision process (SMDP). For each activity we define a set of admissible solutions consisting of the redundant set of optimal policies, and those policies that ascend the optimal statevalue function associated with them. A plan is then generated by merging them in such a way that the solutions to the subordinate activities are realized in the set of admissible solutions satisfying the superior activities. We present our theoretical results and empirically evaluate our approach in a simulated domain.

admissible policy, controller, subgoal, (13 more...)

Neural Information Processing Systems

Country: