Controlling the Behavior of Animated Presentation Agents in the Interface: Scripting versus Instructing

AI Magazine

Lifelike characters, or animated agents, provide a promising option for interface development because they allow us to draw on communication and interaction styles with which humans are already familiar. In this contribution, we revisit some of our past and ongoing projects to motivate an evolution of character-based presentation systems. This evolution starts from systems in which a character presents information content in the style of a TV presenter. It continues with the introduction of presentation teams that convey information to the user by performing role plays. To explore new forms of active user involvement during a presentation, the next step leads to systems that convey information in the style of interactive performances. From a technical point of view, this evolution is mirrored in different approaches to determining the behavior of the employed characters. By means of concrete applications, we argue that a central planning component for automated agent scripting is not always a good choice, especially not in the case of interactive performances, where the user might take on an active role as well.


Interface Agents in Model World Environments

AI Magazine

Choosing an environment is an important decision for agent developers. A key issue in this decision is whether the environment will provide realistic problems for the agent to solve, in the sense that the problems are true to the issues that arise in addressing a particular research question. Beyond realism, other important considerations include how tractable the problems that can be formulated in the environment are, how easily agent performance can be measured, and whether the environment can be customized or extended for specific research questions. In the ideal environment, researchers can pose realistic but tractable problems to an agent, measure and evaluate its performance, and iteratively rework the environment to explore increasingly ambitious questions, all at a reasonable cost in time and effort. As might be expected, trade-offs dominate the suitability of an environment; however, we have found that the modern graphical user interface offers a good balance among these trade-offs. This article takes a brief tour of agent research in the user interface, showing how significant questions related to vision, planning, learning, cognition, and communication are currently being addressed.


Infinite-Horizon Policy-Gradient Estimation

Journal of Artificial Intelligence Research

Gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. In this paper we introduce GPOMDP, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in partially observable Markov decision processes (POMDPs) controlled by parameterized stochastic policies. A similar algorithm was proposed by Kimura et al. (1995). The algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter beta (which has a natural interpretation in terms of bias-variance trade-off), and requires no knowledge of the underlying state. We prove convergence of GPOMDP and show how the correct choice of the parameter beta is related to the mixing time of the controlled POMDP. We briefly describe extensions of GPOMDP to controlled Markov chains; continuous state, observation, and control spaces; multiple agents; higher-order derivatives; and a version for training stochastic policies with internal states. In a companion paper (Baxter et al., this volume) we show how the gradient estimates generated by GPOMDP can be used in both a traditional stochastic-gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward.
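
To make the estimator concrete, here is a minimal sketch of a GPOMDP-style gradient estimate in Python. It assumes a generic simulation interface (reset() returning an observation, step() returning an observation and a reward) and a policy object exposing its parameter count, an action sampler, and the gradient of the log action probability; these names are illustrative assumptions, not the paper's notation.

import numpy as np

def gpomdp_estimate(env, policy, beta, num_steps):
    # Single-trajectory estimate of the gradient of the average reward with
    # respect to the policy parameters (hypothetical interface, illustration only).
    z = np.zeros(policy.num_params)      # eligibility trace
    delta = np.zeros(policy.num_params)  # running gradient estimate
    obs = env.reset()
    for t in range(num_steps):
        action = policy.sample(obs)
        # Discounted trace of grad-log action probabilities; beta in [0, 1)
        # is the single free parameter of the algorithm.
        z = beta * z + policy.grad_log_prob(obs, action)
        obs, reward = env.step(action)
        # Incremental average of reward-weighted traces.
        delta += (reward * z - delta) / (t + 1)
    return delta

In this sketch, larger values of beta reduce the bias of the estimate but increase its variance, which is the trade-off the paper relates to the mixing time of the controlled POMDP.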


Experiments with Infinite-Horizon, Policy-Gradient Estimation

Journal of Artificial Intelligence Research

In this paper, we present algorithms that perform gradient ascent of the average reward in a partially observable Markov decision process (POMDP). These algorithms are based on GPOMDP, an algorithm introduced in a companion paper (Baxter & Bartlett, this volume), which computes biased estimates of the performance gradient in POMDPs. The algorithm's chief advantages are that it uses only one free parameter, beta, which has a natural interpretation in terms of bias-variance trade-off; it requires no knowledge of the underlying state; and it can be applied to infinite state, control, and observation spaces. We show how the gradient estimates produced by GPOMDP can be used to perform gradient ascent, both with a traditional stochastic-gradient algorithm and with an algorithm based on conjugate gradients that uses gradient information to bracket maxima in line searches. Experimental results are presented illustrating both the theoretical results of Baxter and Bartlett (this volume) on a toy problem and practical aspects of the algorithms on a number of more realistic problems.
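
As a rough illustration of how such estimates drive an optimization loop, the following sketch performs plain stochastic gradient ascent with a fixed step size, assuming a gradient-estimator callable such as the GPOMDP-style sketch above; the fixed step size and the absence of a line search are simplifications for illustration, not the paper's experimental setup (which also includes a conjugate-gradient variant that brackets maxima in line searches).

def gradient_ascent(policy, estimate_gradient, step_size=0.01, num_iterations=100):
    # Repeatedly estimate the performance gradient and move the policy
    # parameters in that direction (illustrative interface only).
    for _ in range(num_iterations):
        grad = estimate_gradient(policy)   # e.g., a GPOMDP-style estimate
        policy.params = policy.params + step_size * grad
    return policy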


AltAlt: Combining Graphplan and Heuristic State Search

AI Magazine

We briefly describe the implementation and evaluation of a novel plan synthesis system called AltAlt. AltAlt is designed to exploit the complementary strengths of two currently popular, competing approaches to plan generation: (1) Graphplan and (2) heuristic state search. It uses the planning graph to derive effective heuristics that then guide heuristic state search. The heuristics derived from the planning graph do a better job of taking subgoal interactions into account and, as such, are significantly more effective than existing heuristics. AltAlt was implemented on top of two state-of-the-art planning systems: (1) STAN3.0, a Graphplan-style planner, and (2) HSP-R, a heuristic search planner.
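
As a rough sketch of guiding state search with a planning-graph-derived heuristic, the following Python fragment runs a best-first search whose heuristic sums the planning-graph levels at which unsatisfied goal propositions first appear; the helpers level_cost and successors, and the treatment of states as frozensets of propositions, are assumptions for illustration, not AltAlt's actual interfaces or heuristic family.

import heapq
import itertools

def heuristic(state, goals, level_cost):
    # Sum of planning-graph levels of the goals not yet achieved: a crude
    # illustration of a heuristic extracted from the planning graph.
    return sum(level_cost(g) for g in goals if g not in state)

def plan_search(initial_state, goals, successors, level_cost):
    # Best-first search over states (frozensets of propositions); a counter
    # breaks ties so states never have to be compared directly.
    counter = itertools.count()
    frontier = [(heuristic(initial_state, goals, level_cost), next(counter), 0, initial_state, [])]
    visited = set()
    while frontier:
        f_cost, _, g_cost, state, plan = heapq.heappop(frontier)
        if goals <= state:
            return plan
        if state in visited:
            continue
        visited.add(state)
        for action, next_state in successors(state):
            h = heuristic(next_state, goals, level_cost)
            heapq.heappush(frontier, (g_cost + 1 + h, next(counter), g_cost + 1, next_state, plan + [action]))
    return None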


Creativity at the Metalevel: AAAI-2000 Presidential Address

AI Magazine

Creativity is sometimes taken to be an inexplicable aspect of human activity. By summarizing a considerable body of literature on creativity, I hope to show how to turn some of the best ideas about creativity into programs that are demonstrably more creative than any we have seen to date. I believe the key to building more creative programs is to give them the ability to reflect on and modify their own frameworks and criteria. That is, I believe that the key to creativity is at the metalevel.


TALplanner: A Temporal Logic-Based Planner

AI Magazine

TALplanner is a forward-chaining planner that utilizes domain-dependent knowledge to control search in the state space generated by action invocation. The domain-dependent control knowledge, background knowledge, plans, and goals are all represented as formulas in a temporal logic called TAL, which has been developed independently as a formalism for specifying agent narratives and reasoning about them. In the planning competition held at the Fifth International Conference on Artificial Intelligence Planning and Scheduling, TALplanner exhibited impressive performance, winning the Outstanding Performance Award in the Domain-Dependent Planning Competition. In this article, we provide an overview of TALplanner.
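
The core idea can be pictured as a forward search in which successor states violating the control knowledge are pruned. The following Python sketch illustrates that loop, with control_ok() standing in for evaluation of the TAL control formulas and the other callables (goal_test, applicable_actions, apply_action) being assumed helpers rather than TALplanner's actual machinery.

from collections import deque

def forward_search(initial_state, goal_test, applicable_actions, apply_action, control_ok):
    # Forward-chaining search; domain-dependent control knowledge prunes
    # successor states that violate the control formulas, which is what
    # keeps the explored state space manageable.
    frontier = deque([(initial_state, [])])
    while frontier:
        state, plan = frontier.popleft()
        if goal_test(state):
            return plan
        for action in applicable_actions(state):
            next_state = apply_action(state, action)
            if control_ok(plan + [action], next_state):
                frontier.append((next_state, plan + [action]))
    return None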