 Agents


On Local Rewards and Scaling Distributed Reinforcement Learning

Neural Information Processing Systems

We consider the scaling of the number of examples necessary to achieve good performance in distributed, cooperative, multi-agent reinforcement learning, as a function of the number of agents n. We prove a worst-case lower bound showing that algorithms that rely solely on a global reward signal to learn policies confront a fundamental limit: they require a number of real-world examples that scales roughly linearly in the number of agents. For settings of interest with a very large number of agents, this is impractical. We demonstrate, however, that there is a class of algorithms that, by taking advantage of local reward signals in large distributed Markov Decision Processes, are able to ensure good performance with a number of samples that scales as O(log n). This makes them applicable even in settings with a very large number of agents n.
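To make the contrast between the two scaling regimes concrete, the sketch below compares a sample requirement that grows linearly in n with one that grows as log n. The constant factor `c` and both function names are illustrative assumptions, not values or names from the paper; the stated bounds are asymptotic and fix no particular constants.

```python
import math


def samples_linear(n: int, c: float = 1.0) -> float:
    """Hypothetical sample count that grows linearly in the number of agents n."""
    return c * n


def samples_log(n: int, c: float = 1.0) -> float:
    """Hypothetical sample count that grows as O(log n)."""
    return c * math.log(n)


# Print both growth curves for a few population sizes (c = 1 is an assumption).
for n in (10, 1_000, 1_000_000):
    print(f"n={n:>9,}  linear={samples_linear(n):>12,.0f}  log={samples_log(n):6.1f}")
```

Under these assumed constants, the logarithmic requirement stays in the low tens even at a million agents, while the linear one grows with the population itself.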


Policy-Gradient Methods for Planning

Neural Information Processing Systems

Probabilistic temporal planning attempts to find good policies for acting in domains with concurrent durative tasks, multiple uncertain outcomes, and limited resources. These domains are typically modelled as Markov decision problems and solved using dynamic programming methods. This paper demonstrates the application of reinforcement learning -- in the form of a policy-gradient method -- to these domains. Our emphasis is large domains that are infeasible for dynamic programming. Our approach is to construct simple policies, or agents, for each planning task. The result is a general probabilistic temporal planner, named the Factored Policy-Gradient Planner (FPG-Planner), which can handle hundreds of tasks, optimising for probability of success, duration, and resource use.


Resource Allocation Among Agents with MDP-Induced Preferences

Journal of Artificial Intelligence Research

Allocating scarce resources among agents to maximize global utility is, in general, computationally challenging. We focus on problems where resources enable agents to execute actions in stochastic environments, modeled as Markov decision processes (MDPs), such that the value of a resource bundle is defined as the expected value of the optimal MDP policy realizable given these resources. We present an algorithm that simultaneously solves the resource-allocation and the policy-optimization problems. This allows us to avoid explicitly representing utilities over exponentially many resource bundles, leading to drastic (often exponential) reductions in computational complexity. We then use this algorithm in the context of self-interested agents to design a combinatorial auction for allocating resources. We empirically demonstrate the effectiveness of our approach by showing that it can, in minutes, optimally solve problems for which a straightforward combinatorial resource-allocation technique would require the agents to enumerate up to 2^100 resource bundles and the auctioneer to solve an NP-complete problem with an input of that size.
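The 2^100 figure follows from counting subsets: if each resource is a single indivisible item that is either in a bundle or not, then m resources yield 2^m possible bundles. The sketch below only illustrates that count, under that assumption; `num_bundles` is a hypothetical helper and not part of the authors' algorithm.

```python
def num_bundles(num_resources: int) -> int:
    """Count the subsets (bundles) of a set of indivisible resources.

    Assumes each resource is either included in a bundle or not,
    so the count is 2 ** num_resources.
    """
    return 2 ** num_resources


# With 100 resources the count has 31 decimal digits, far too many to enumerate.
count = num_bundles(100)
print(count, len(str(count)))
```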


Reports on the Twenty-First National Conference on Artificial Intelligence (AAAI-06) Workshop Program

AI Magazine

The workshop program of the Twenty-First National Conference on Artificial Intelligence was held July 16-17, 2006 in Boston, Massachusetts. The program was chaired by Joyce Chai and Keith Decker. The titles of the 17 workshops were AI-Driven Technologies for Service-Oriented Computing; Auction Mechanisms for Robot Coordination; Cognitive Modeling and Agent-Based Social Simulations; Cognitive Robotics; Computational Aesthetics: Artificial Intelligence Approaches to Beauty and Happiness; Educational Data Mining; Evaluation Methods for Machine Learning; Event Extraction and Synthesis; Heuristic Search, Memory-Based Heuristics, and Their Applications; Human Implications of Human-Robot Interaction; Intelligent Techniques in Web Personalization; Learning for Search; Modeling and Retrieval of Context; Modeling Others from Observations; and Statistical and Empirical Approaches for Spoken Dialogue Systems.


AI Meets Web 2.0: Building the Web of Tomorrow, Today

AI Magazine

Imagine an Internet-scale knowledge system where people and intelligent agents can collaborate on solving complex problems in business, engineering, science, medicine, and other endeavors. Its resources include semantically tagged websites, wikis, and blogs, as well as social networks, vertical search engines, and a vast array of web services from business processes to AI planners and domain models. Research prototypes of decentralized knowledge systems have been demonstrated for years, but now, thanks to the web and Moore's law, they appear ready for prime time. This article introduces the architectural concepts for incrementally growing an Internet-scale knowledge system and illustrates them with scenarios drawn from e-commerce, e-science, and e-life.


AAAI's National and Innovative Applications Conferences Celebrate 50 Years of AI

AI Magazine

... on Artificial Intelligence and the Nineteenth Innovative Applications of Artificial Intelligence Conference commemorated fifty years of artificial intelligence research in ... The celebration then moved to Boston where a huge turnout of AAAI fellows--from founding luminaries to 2006 fellow inductees--reported a great weekend meeting prior to the AAAI conference full of discussions ... web and integrated intelligence, as well as the nectar and senior member papers, is a significant factor in this trend." Senior member papers are a way to collect reflections about areas of work by leaders in the field.


Multi-Issue Negotiation with Deadlines

Journal of Artificial Intelligence Research

Now, there are a number of different procedures that can be used for this process; the three main ones being the package deal procedure in which all the issues are bundled and discussed together, the simultaneous procedure in which the issues are discussed simultaneously but independently of each other, and the sequential procedure in which the issues are discussed one after another. Since each of them yields a different outcome, a key problem is to decide which one to use in which circumstances. Specifically, we consider this question for a model in which the agents have time constraints (in the form of both deadlines and discount factors) and information uncertainty (in that the agents do not know the opponent's utility function). For this model, we consider issues that are both independent and those that are interdependent and determine equilibria for each case for each procedure. In so doing, we show that the package deal is in fact the optimal procedure for each party. We then go on to show that, although the package deal may be computationally more complex than the other two procedures, it generates Pareto optimal outcomes (unlike the other two), it has similar earliest and latest possible times of agreement to the simultaneous procedure (which is better than the sequential procedure), and that it (like the other two procedures) generates a unique outcome only under certain conditions (which we define).
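The two kinds of time constraint mentioned above, deadlines and discount factors, are often modeled together as a utility that shrinks each round and vanishes once the deadline passes. The sketch below shows one such common form for a single issue; the function name, the zero payoff after the deadline, and the exponent convention are assumptions for illustration and may differ from the paper's exact formulation.

```python
def discounted_utility(share: float, t: int, delta: float, deadline: int) -> float:
    """Utility of receiving `share` of an issue at time step t.

    Assumed model (not taken from the paper): the value is discounted by
    delta ** t, and any agreement after the deadline is worth nothing.
    """
    if t > deadline:
        return 0.0
    return share * delta ** t


# The same share is worth less the later it is agreed, and nothing after the deadline.
for t in (0, 2, 6):
    print(t, discounted_utility(1.0, t, delta=0.9, deadline=5))
```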


Distributed Control of Microscopic Robots in Biomedical Applications

arXiv.org Artificial Intelligence

Current developments in molecular electronics, motors and chemical sensors could enable constructing large numbers of devices able to sense, compute and act in micron-scale environments. Such microscopic machines, of sizes comparable to bacteria, could simultaneously monitor entire populations of cells individually in vivo. This paper reviews plausible capabilities for microscopic robots and the physical constraints due to operation in fluids at low Reynolds number, diffusion-limited sensing and thermal noise from Brownian motion. Simple distributed controls are then presented in the context of prototypical biomedical tasks, which require control decisions on millisecond time scales. The resulting behaviors illustrate trade-offs among speed, accuracy and resource use. A specific example is monitoring for patterns of chemicals in a flowing fluid released at chemically distinctive sites. Information collected from a large number of such devices allows estimating properties of cell-sized chemical sources in a macroscopic volume. The microscopic devices moving with the fluid flow in small blood vessels can detect chemicals released by tissues in response to localized injury or infection. We find the devices can readily discriminate a single cell-sized chemical source from the background chemical concentration, providing high-resolution sensing in both time and space. By contrast, such a source would be difficult to distinguish from background when diluted throughout the blood volume as obtained with a blood sample.